Best way to insert sorted in a list

SherjilOzair

unread,

Jun 17, 2011, 4:53:10 PM6/17/11

to

There are basically two ways to go about this.
One is, to append the new value, and then sort the list.
Another is to traverse the list, and insert the new value at the
appropriate position.

The second one's complexity is O(N), while the first one's is O(N *
log N).

Still, the second one works much better, because C code is being used
instead of pythons.

Still, being a programmer, using the first way (a.insert(x);
a.sort()), does not feel right.

What has the community to say about this ? What is the best (fastest)
way to insert sorted in a list ?

Shashank Singh

unread,

Jun 17, 2011, 5:02:14 PM6/17/11

to SherjilOzair, pytho...@python.org

On Sat, Jun 18, 2011 at 2:23 AM, SherjilOzair <sherji...@gmail.com> wrote:
> There are basically two ways to go about this.
> One is, to append the new value, and then sort the list.
> Another is to traverse the list, and insert the new value at the
> appropriate position.
>
> The second one's complexity is O(N), while the first one's is O(N *
> log N).

Correct me if I am wrong here but isn't the second one is O(log N)?
Binary search?
That is when you have an already sorted list from somewhere and you
are inserting just one new value.
In case you are building the whole list yourself it's the same (N * log N)

>
> Still, the second one works much better, because C code is being used
> instead of pythons.
>
> Still, being a programmer, using the first way (a.insert(x);
> a.sort()), does not feel right.
>
> What has the community to say about this ? What is the best (fastest)
> way to insert sorted in a list ?

> --
> http://mail.python.org/mailman/listinfo/python-list
>

--
Regards
Shashank Singh
http://www.cse.iitb.ac.in/~shashanksingh

Ethan Furman

unread,

Jun 17, 2011, 5:31:12 PM6/17/11

to SherjilOzair, pytho...@python.org

SherjilOzair wrote:
> What has the community to say about this ? What is the best (fastest)
> way to insert sorted in a list ?

Check out the bisect module.

~Ethan~

Ian Kelly

unread,

Jun 17, 2011, 5:23:54 PM6/17/11

to Python

On Fri, Jun 17, 2011 at 3:02 PM, Shashank Singh
<shashank.s...@gmail.com> wrote:
> Correct me if I am wrong here but isn't the second one is O(log N)?
> Binary search?
> That is when you have an already sorted list from somewhere and you
> are inserting just one new value.

Finding the position to insert is O(log n), but the actual insert is
O(n). This is because Python lists are implemented with arrays and
everything after the inserted item has to be moved in memory to make
space for the insert.

Chris Torek

unread,

Jun 17, 2011, 5:48:24 PM6/17/11

to

In article <c63e771c-8968-4d7a...@35g2000prp.googlegroups.com>

SherjilOzair <sherji...@gmail.com> wrote:
>There are basically two ways to go about this.
>One is, to append the new value, and then sort the list.
>Another is to traverse the list, and insert the new value at the
>appropriate position.
>
>The second one's complexity is O(N), while the first one's is O(N *
>log N).

This is not quite right; see below.

>Still, the second one works much better, because C code is being used
>instead of pythons.
>
>Still, being a programmer, using the first way (a.insert(x);
>a.sort()), does not feel right.
>
>What has the community to say about this ? What is the best (fastest)
>way to insert sorted in a list ?

In this case, the "best" way is most likely "don't do that at all".

First, we should note that a python list() data structure is actually
an array. Thus, you can locate the correct insertion point pretty
fast, by using a binary or (better but not as generally applicable)
interpolative search to find the proper insertion point.

Having found that point, though, there is still the expense of
the insertion, which requires making some room in the array-that-
makes-the-list (I will use the name "a" as you did above):

position = locate_place_for_insert(a, the_item)
# The above is O(log n) for binary search,
# O(log log n) for interpolative search, where
# n is len(a).

a.insert(position, the_item)
# This is still O(log n), alas.

Appending to the list is much faster, and if you are going to
dump a set of new items in, you can do that with:

# wrong way:
# for item in large_list:
# a.append(item)
# right way, but fundamentally still the same cost (constant
# factor is much smaller due to built-in append())
a.append(large_list)

If len(large_list) is m, this is O(m). Inserting each item in
the "right place" would be O(m log (n + m)). But we still
have to sort:

a.sort()

This is O(log (n + m)), hence likely better than repeatedly inserting
in the correct place.

Depending on your data and other needs, though, it might be best
to use a red-black tree, an AVL tree, or a skip list. You might
also investigate radix sort, radix trees, and ternary search trees
(again depending on your data).
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: gmail (figure it out) http://web.torek.net/torek/index.html

Ethan Furman

unread,

Jun 17, 2011, 6:24:20 PM6/17/11

to Chris Torek, pytho...@python.org

Chris Torek wrote:
> Appending to the list is much faster, and if you are going to
> dump a set of new items in, you can do that with:
>
> # wrong way:
> # for item in large_list:
> # a.append(item)
> # right way, but fundamentally still the same cost (constant
> # factor is much smaller due to built-in append())
> a.append(large_list)

^- should be a.extend(large_list)

~Ethan~

Ian Kelly

unread,

Jun 17, 2011, 7:33:37 PM6/17/11

to Python

On Fri, Jun 17, 2011 at 3:48 PM, Chris Torek <nos...@torek.net> wrote:
> If len(large_list) is m, this is O(m). Inserting each item in
> the "right place" would be O(m log (n + m)). But we still
> have to sort:
>
> a.sort()
>
> This is O(log (n + m)), hence likely better than repeatedly inserting
> in the correct place.

Surely you mean O((n + m) log (n + m)).

Chris Torek

unread,

Jun 17, 2011, 8:46:44 PM6/17/11

to

In article <itgi3...@news2.newsguy.com> I wrote, in part:

>>Appending to the list is much faster, and if you are going to

>>dump a set of new items in, you can do that with: [...]

In article <mailman.96.13083486...@python.org>
Ethan Furman <et...@stoneleaf.us> wrote:

>> a.append(large_list)
> ^- should be a.extend(large_list)

Er, right. Posted in haste (had to get out the door). I also
wrote:

>> If len(large_list) is m, this is O(m). Inserting each item in
>> the "right place" would be O(m log (n + m)). But we still
>> have to sort:
>>
>> a.sort()

In article <mailman.98.13083536...@python.org>,

Ian Kelly <ian.g...@gmail.com> wrote:
>> This is O(log (n + m)), hence likely better than repeatedly inserting
>> in the correct place.

>Surely you mean O((n + m) log (n + m)).

Er, "maybe"? (It depends on the relative values of m and n, and
the underlying sort algorithm to some extent. Some algorithms are
better at inserting a relatively small number of items into a
mostly-sorted large list. As I recall, Shell sort does well with
this.) But generally, yes. See "posted in haste" above. :-)

There are a lot of other options, such as sorting just the list of
"items to be inserted", which lets you do a single merge pass:

# UNTESTED
def merge_sorted(it1, it2, must_copy = True):
"""
Merge two sorted lists/iterators it1 and it2.
Roughly equivalent to sorted(list(it2) + list(it2)),
except for attempts to be space-efficient.

You can provide must_copy = False if the two iterators
are already lists and can be destroyed for the purpose
of creating the result.
"""

# If it1 and it1 are deque objects, we don't need to
# reverse them, as popping from the front is efficient.
# If they are plain lists, popping from the end is
# required. If they are iterators or tuples we need
# to make a list version anyway. So:
if must_copy:
it1 = list(it1)
it2 = list(it2)

# Reverse sorted lists (it1 and it2 are definitely
# lists now) so that we can pop off the end.
it1.reverse()
it2.reverse()

# Now accumulate final sorted list. Basically, this is:
# take first (now last) item from each list, and put whichever
# one is smaller into the result. When either list runs
# out, tack on the entire remaining list (whichever one is
# non-empty -- if both are empty, the two extend ops are
# no-ops, so we can just add both lists).
#
# Note that we have to re-reverse them to get
# them back into forward order before extending.
result = []
while it1 and it2:
# Note: I don't know if it might be faster
# to .pop() each item and .append() the one we
# did not want to pop after all. This is just
# an example, after all.
last1 = it1[-1]
last2 = it2[-1]
if last2 < last1:
result.append(last2)
it2.pop()
else:
result.append(last1)
it1.pop()
it1.reverse()
it2.reverse()
result.extend(it1)
result.extend(it2)
return result

So, now if "a" is the original (sorted) list and "b" is the not-yet-
sorted list of things to add:

a = merge_sorted(a, sorted(b), must_copy = False)

will work, provided you are not required to do the merge "in place".
Use the usual slicing trick if that is necessary:

a[:] = merge_sorted(a, sorted(b), must_copy = False)

If list b is already sorted, leave out the sorted() step. If list
b is not sorted and is particularly long, use b.sort() to sort in
place, rather than making a sorted copy.

Steven D'Aprano

unread,

Jun 17, 2011, 10:27:07 PM6/17/11

to

On Fri, 17 Jun 2011 13:53:10 -0700, SherjilOzair wrote:

> What has the community to say about this ? What is the best (fastest)
> way to insert sorted in a list ?

if you're doing repeated insertions into an already sorted list, there's
no question that the bisect module is the right way to do it.

(Unless you have a very old version of Python, version 2.3 or older.)

>>> from timeit import Timer
>>> setup = """
... L = list(range(10000)) + list(range(10100, 30000))
... from bisect import insort
... def sort_insert(a, x):
... a.append(x)
... a.sort()
... """
>>> t_bisect = Timer("for i in range(10000, 10100): insort(L, i)", setup)
>>> t_sort = Timer("for i in range(10000, 10100): sort_insert(L, i)", setup)
>>>
>>> t_sort.timeit(number=100)
19.430757999420166
>>> t_bisect.timeit(number=100)
0.3741610050201416

(For those unfamiliar with timeit, small numbers are faster.)

--
Steven

Message has been deleted

Vito De Tullio

unread,

Jun 19, 2011, 10:21:47 AM6/19/11

to pytho...@python.org

SherjilOzair wrote:

> There are basically two ways to go about this.

[...]

> What has the community to say about this ? What is the best (fastest)
> way to insert sorted in a list ?

a third way maybe using a SkipList instead of a list

on http://infohost.nmt.edu/tcc/help/lang/python/examples/pyskip/ you can
find a pure python implementation (but afaik there are many others)

--
ZeD

logon...@gmail.com

unread,

Sep 8, 2017, 2:39:43 PM9/8/17

to

a = []
num = int(input('How many numbers: '))
for n in range(num):
numbers = int(input('Enter values:'))
a.append(numbers)

b = sorted(a)
print(b)
c = int(input("enter value:"))
for i in range(len(b)):
if b[i] > c:
index = i
break
d = b[:i] + [c] + b[i:]
print(d)

Steve D'Aprano

unread,

Sep 8, 2017, 3:00:58 PM9/8/17

to

On Sat, 9 Sep 2017 04:39 am, logon...@gmail.com wrote:

> On Saturday, June 18, 2011 at 2:23:10 AM UTC+5:30, SherjilOzair wrote:

.......................^^^^^^

You're replying to something six years old. Its doubtful the original poster is
still reading.

>> What has the community to say about this ? What is the best (fastest)
>> way to insert sorted in a list ?
>
> a = []
> num = int(input('How many numbers: '))
> for n in range(num):
> numbers = int(input('Enter values:'))
> a.append(numbers)
>
> b = sorted(a)
> print(b)
> c = int(input("enter value:"))
> for i in range(len(b)):
> if b[i] > c:
> index = i
> break
> d = b[:i] + [c] + b[i:]
> print(d)

Doing a linear search, followed by slicing and list concatenation, is not likely
to be the fastest method. If you think it is fast, that's because you have only
tested it on small lists. Try a list with (say) a million items.

Probably the best way is to use the bisect module to insert into a sorted list.
Or append to the end, then sort in place. Python's sort is *very* efficient
with almost sorted data.

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

Stephan Houben

unread,

Sep 8, 2017, 3:09:21 PM9/8/17

to

Op 2017-09-08, logon...@gmail.com schreef <logon...@gmail.com>:

> On Saturday, June 18, 2011 at 2:23:10 AM UTC+5:30, SherjilOzair wrote:
>> There are basically two ways to go about this.
>> One is, to append the new value, and then sort the list.
>> Another is to traverse the list, and insert the new value at the
>> appropriate position.
>>
>> The second one's complexity is O(N), while the first one's is O(N *
>> log N).
>>
>> Still, the second one works much better, because C code is being used
>> instead of pythons.

Python uses the Timsort ( https://en.wikipedia.org/wiki/Timsort )
algorithm. Timsort is O(N) in the special case of a list of N elements
where the first N-1 are already sorted and the last one is arbitrary.

So appending the value and then calling sort() is in fact O(N) in Python
(hence asymptotically optimal), and also practically fast since the
sort() is implemented in C.

Stephan