Performance of ORDER BY vs. list.sort() vs. sorted()

1,545 views
Skip to first unread message

jens.t...@gmail.com

unread,
Feb 28, 2018, 4:02:29 PM2/28/18
to sqlalchemy
Hello,

I’m curious about your experience with sorting the results of all() queries which return a list. It seems to me that there are three ways of sorting such result lists:
  1. Use an ORDER BY in the query and let the database do the sorting.
  2. Use Python’s list.sort() and sort the result list in-place.
  3. Use Python’s sorted() function and construct a new and sorted list from the result list.
Is there one preferable over the other, particularly wrt. performance? Is sorting on the db side more advantageous than sorting the result list later? (For list.sort() vs. sorted() see here.)

Thanks!
Jens

Matt Zagrabelny

unread,
Feb 28, 2018, 4:42:44 PM2/28/18
to sqlal...@googlegroups.com
On Wed, Feb 28, 2018 at 3:02 PM, <jens.t...@gmail.com> wrote:
Hello,

I’m curious about your experience with sorting the results of all() queries which return a list. It seems to me that there are three ways of sorting such result lists:
  1. Use an ORDER BY in the query and let the database do the sorting.
I would guess that the DB would be the fastest if one were to benchmark sorting speeds. I have not.
 
  1. Use Python’s list.sort() and sort the result list in-place.
  2. Use Python’s sorted() function and construct a new and sorted list from the result list.

I would guess that both Python sort incantations are of similar performance values. Here is a Stack Overflow thread about sort() vs sorted()


-m

Jonathan Vanasco

unread,
Mar 2, 2018, 1:32:16 AM3/2/18
to sqlalchemy
This is going to wildly depend on how many things are being sorted, and what those things are.  this topic usually a premature optimization or "you're doing it wrong".

Imagine this query in Postgres:

   SELECT * FROM records WHERE ORDER BY timestamp_desc;

If there are 1,000 items in the database, whether you sort in Python or Postgres is irrelevant – the speed will be negligible.

If there are 1,000,000 items in the database, then database sorting is probably faster, as you won't have to do the operations in Python.

But consider this slightly different query where a limit is added, which would happen before the sort:

   SELECT * FROM records WHERE ORDER BY timestamp_desc LIMIT 100;

You need to sort in the database, so re-sorting in python is largely irrelevant – but if you do that, it will likely be negligible.

Reply all
Reply to author
Forward
0 new messages