Possible thread-safety regression for sitemaps in r13980

1 view
Skip to first unread message

Justin Bronn

unread,
Oct 10, 2010, 1:27:29 PM10/10/10
to Django developers
After r13980, GIS sitemaps were broken, and when attempting to fix I
had questions about the underlying implementation of the changeset:

http://code.djangoproject.com/changeset/13980/django/trunk/django/contrib/sitemaps/views.py

In the loop iterating over the `sitemaps` argument (L12), the `site`
value is a global Sitemap (or subclass thereof) instance typically
defined in a user module. While adding the missing `site.request =
request` assignment in the GIS sitemaps code enables tests to pass --
this solution appears thread unsafe because we are assigning, on a per-
request basis to the sitemap view, the `request` attribute to a module-
level global. What if two different search engines request the
sitemap index at the same time? Each would be clobbering over the
`request` attribute for the other, producing inconsistent sitemaps.

First, I'd like more eyes to confirm/deny my interpretation here.
Second, if this is confirmed, how should we solve it -- the situation
is compounded by the fact this changeset fixes 7 tickets. Luke, is
there anyway to decouple this from fixes for the other tickets?

-Justin

Gabriel Hurley

unread,
Oct 10, 2010, 6:17:24 PM10/10/10
to Django developers
I believe you are correct, since sitemaps are global objects, this
would not be thread-safe. Fixing it should be fairly easy, however.

Setting self.request was only necessary for two reasons:

1. Sitemap.get_urls() uses it, but either the request or the
current_site variable could be easily be passed as a parameter to
get_urls. Personally I'd lean towards passing around a current_site
variable rather than request objects... Having read through all the
arguments about thread-safety for the Class-Based Views feature, it
seems like explicit passing should solve the thread-safety issue.

2. FlatPageSitemap.items() uses it... however, flatpages is directly
coupled to the sites framework with an M2M field, so the better way to
do this is to filter the FlatPage objects based on the current site.
You'd never even get to the point of items() being called if you don't
have contrib.sites installed so protecting against not having it is
useless.

I've opened a ticket for it here:

http://code.djangoproject.com/ticket/14433

and added a patch which I believe should restore thread safety.

- Gabriel


On Oct 10, 10:27 am, Justin Bronn <jbr...@gmail.com> wrote:
> After r13980, GIS sitemaps were broken, and when attempting to fix I
> had questions about the underlying implementation of the changeset:
>
> http://code.djangoproject.com/changeset/13980/django/trunk/django/con...

Luke Plant

unread,
Oct 11, 2010, 8:07:03 AM10/11/10
to django-d...@googlegroups.com
On Sun, 2010-10-10 at 15:17 -0700, Gabriel Hurley wrote:
> I believe you are correct, since sitemaps are global objects, this
> would not be thread-safe. Fixing it should be fairly easy, however.
>
> Setting self.request was only necessary for two reasons:
>
> 1. Sitemap.get_urls() uses it, but either the request or the
> current_site variable could be easily be passed as a parameter to
> get_urls. Personally I'd lean towards passing around a current_site
> variable rather than request objects... Having read through all the
> arguments about thread-safety for the Class-Based Views feature, it
> seems like explicit passing should solve the thread-safety issue.

Yes, passing the site seems like a better option. We are indeed hitting
exactly the problem we have hopefully avoided with the class-based
views, but can't solve in the same way here due to backwards
compatibility.

> 2. FlatPageSitemap.items() uses it... however, flatpages is directly
> coupled to the sites framework with an M2M field, so the better way to
> do this is to filter the FlatPage objects based on the current site.
> You'd never even get to the point of items() being called if you don't
> have contrib.sites installed so protecting against not having it is
> useless.

Yes - a RequestSite cannot have a 'flatpages_set' attribute, so there is
no need for FlatPageSitemap.items to be given a request or a RequestSite
object.

There is still the problem that the current site (or request) is not
available to the Sitemap.items() method, or the Sitemap.paginator
property, which some people might want. We can't fix this without a lot
of ugliness, like ignoring the paginator property for a start, so I'm
going to commit your fix as it is.

Thanks.

Luke

--
"God demonstrates his love towards us in this, that while we were
still sinners, Christ died for us." (Romans 5:8)

Luke Plant || http://lukeplant.me.uk/

Reply all
Reply to author
Forward
0 new messages