Making __repr__ safe by default

164 views
Skip to first unread message

Patryk Zawadzki

unread,
May 18, 2013, 12:04:48 PM5/18/13
to django-d...@googlegroups.com
As suggested by Marc Tamlyn I am posting this here for discussion:

https://code.djangoproject.com/ticket/20448

tl;dr:

Currently __repr__() calls __unicode__() which can be a very bad
thing. You really don't want things like an exception being raised or
debugger being used to cause additional side effects (like executing a
database query inside __unicode__).

--
Patryk Zawadzki
I solve problems.

Anssi Kääriäinen

unread,
May 18, 2013, 12:47:05 PM5/18/13
to Django developers
There was very similar discussion recently (maybe in some ticket, I
don't remember where) about QuerySet.__repr__(). IIRC the conclusion
was that executing SELECT queries as part of __repr__() is OK. The
reasoning had four points:
1. This is the way it has always been
2. Due to #1, changes in __repr__() will cascade to a lot of
tutorials etc which use the current repr format.
3. SELECT queries should be safe (for QuerySet SELECT FOR UPDATE not
so safe...)
4. __repr__() is more informative the current way

So, changing the way Model.__repr__() works should consider those four
points. To me it seems that points #1 - #3 apply directly, maybe also
#4.

I think we can keep things as is. I do see that the current way can be
problematic in some cases, but at least for Model.__unicode__(), if
you know it is unsafe, then write your own safe __repr__(). And if it
is decided that this change is after all needed, then doing changes to
both QuerySet.__repr__() and Model.__repr__() should be done in one
go.

- Anssi

Patryk Zawadzki

unread,
May 19, 2013, 6:17:48 AM5/19/13
to django-d...@googlegroups.com

18 maj 2013 18:48, "Anssi Kääriäinen" <anssi.ka...@thl.fi> napisał(a):
> There was very similar discussion recently (maybe in some ticket, I
> don't remember where) about QuerySet.__repr__(). IIRC the conclusion
> was that executing SELECT queries as part of __repr__() is OK.

I've been bitten by that one already. At least with postgresql if you invalidate the transaction executing any query other than rollback will raise an exception. It's not something you plan for ahead and it's a real pain to find the real cause.

Karen Tracey

unread,
May 19, 2013, 11:42:17 AM5/19/13
to django-d...@googlegroups.com
I agree with Anssi, repr should stay as-is. I do a lot of shell/pdb work and I can't recall ever encountering a problem with "unsafe" repr. I think many people would find it annoying if suddenly repr would tell you no more than the pk of the object.

Karen

Patryk Zawadzki

unread,
Jun 3, 2013, 6:56:26 AM6/3/13
to django-d...@googlegroups.com
2013/5/19 Karen Tracey <kmtr...@gmail.com>:
> I agree with Anssi, repr should stay as-is. I do a lot of shell/pdb work and

In interactive mode it's just as easy to call __unicode__ as it is to
call __repr__. On the other hand __repr__ is used by automated tools
such as crash reporters (raven).

> I can't recall ever encountering a problem with "unsafe" repr.

We had it explode a number of times when coupled with postgres. In
that case instead of the original exception, all you get is:

psycopg2.ProgrammingError: current transaction is aborted, commands
ignored until end of transaction block

> I think many
> people would find it annoying if suddenly repr would tell you no more than
> the pk of the object.

I'd argue that the PK is actually more important than whatever
__unicode__ returns. It's really annoying to find a stack trace that
only contains product's title when you have tens of millions of
products on production and title is not an indexed field. And by that
time it's already too late to override __repr__. On the other hand
it's fairly easy to extend __repr__ to display other information if
you find it more useful than the PK.

Backward compatibility should not be a concern here as no code should
depend on whatever __repr__ chooses to return.

Patryk Zawadzki

unread,
Apr 7, 2017, 6:45:11 AM4/7/17
to Django developers (Contributions to Django itself)
Hey, it's been a number of years and I'd like to revisit this topic.

I still consider it an anti-pattern to have __repr__ make a call to self.__unicode__ or self.__str__ as these can (and very often do) refer to related database objects. The last thing you want to happen while debugging a crash is for the mere act of observation to result in more database queries being executed. __repr__ is assumed to be safe to call by most debugging tools and is called by all crash reporting tools I've worked with. Having __repr__ itself crash (for example when the database connection is invalid) while handling an exception in a production environment can lead to exceptions being silently ignored or to misleading/useless stack traces being reported.

PS: Backwards compatibility was brought up as a potential issue but __repr__ is not usually an API consumed by application code.

Cheers

jp...@yourlabs.org

unread,
Apr 7, 2017, 8:18:58 AM4/7/17
to Django developers (Contributions to Django itself)
I'd like to support this, I've seen my share of situations such as the one described by Patryk. Is there something we have been missing ?

Adam Johnson

unread,
Apr 7, 2017, 8:36:43 AM4/7/17
to django-d...@googlegroups.com
Patryk, when you say 'revisit', do you have links to the original discussion/tickets?

I also agree that __repr__ shouldn't be able to trigger extra DB queries in a standard setup.

On 7 April 2017 at 13:18, <jp...@yourlabs.org> wrote:
I'd like to support this, I've seen my share of situations such as the one described by Patryk. Is there something we have been missing ?

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscribe@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/f0e2c9b1-39d2-462e-8b8a-4b1c0f4822c2%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.



--
Adam

Shai Berger

unread,
Apr 7, 2017, 9:20:03 AM4/7/17
to django-d...@googlegroups.com
On Friday 07 April 2017 15:35:55 Adam Johnson wrote:
> Patryk, when you say 'revisit', do you have links to the original
> discussion/tickets?

He was replying to a thread from 2013. You should have the link at the bottom
of every message.

>
> I also agree that __repr__ shouldn't be able to trigger extra DB queries in
> a standard setup.
>

... and this ties into my other suggestion: __repr__ could be implemented as,
basically,

with database_disabled:
return str(self)

Marco Silva

unread,
Apr 8, 2017, 12:13:40 PM4/8/17
to Django developers (Contributions to Django itself), pat...@room-303.com
going in the python docs we find this
If at all possible, this should look like a valid Python expression that could be used to recreate an object with the same value (given an appropriate environment).
so maybe something that returned ModelName(field1=value1, field2=value2 ...) where fields with references could return the id of the reference to avoid DB lookups 
Reply all
Reply to author
Forward
0 new messages