Lua table.sort invokes the comparing function with the same element

332 views
Skip to first unread message

Claudi

unread,
Jul 12, 2024, 5:36:09 AM7/12/24
to lua-l
(NOTE: this issue has been already reported on StackOverflow)

I have stumbled upon a strange behavior of `table.sort()`, in which the comparing function is invoked by passing the same array element in both parameters.

Consider the code below. I want to sort the array from highest to lowest value of the property `v` and, in case they are equal (and since `table.sort()` algorithm is not stable as of [the docs][1]), I want to sort them by the original indices of the elements.

```lua
-- The array to sort
local arr = {
  {id="A", v = 1},
  {id="B", v = 1},
  {id="C", v = 0},
  {id="D", v = 1},
  {id="E", v = 1}
}

-- A map containing the original index of the elements in the array: element => originalIndex
local original_indices = {}
for index, elem in ipairs(arr) do
  original_indices[elem] = index
end

-- Sort the array
table.sort(arr,
  function(a, b)
    assert(a ~= b, "Comparing the same element in the array!")
   
    -- Compare the value
    if a.v > b.v then
      return true
    elseif a.v < b.v then
      return false
    end
   
    -- Values are equal. Compare original indices, which should always
    -- decide the final order.
    local ia = original_indices[a]
    local ib = original_indices[b]
   
    if ia < ib then
      return true
    elseif ia > ib then
      return false
    end
   
    error("BUG! Comparing the same element in the array!")
  end
)
```


The expected result should be:
```lua
{
  {id="C", v = 0},
  {id="A", v = 1},
  {id="B", v = 1},
  {id="D", v = 1},
  {id="E", v = 1}
}
```

But, instead, I get an error because Lua is invoking the sorting function by passing the same element in both parameters, which should not.

Am I missing something? How can I get the expected result?

You can play around with the code [here][2].


  [1]: https://www.lua.org/manual/5.3/manual.html#pdf-table.sort
  [2]: https://onecompiler.com/lua/42jwnh2j2

Sainan

unread,
Jul 12, 2024, 7:09:45 AM7/12/24
to lu...@googlegroups.com
Have you considered that the issue might be with your sorting function? When I narrow it down to a minimum viable repro, I don't see the behaviour you are seeing:

local arr = {
{id=1},
{id=2},
{id=0},
{id=4},
{id=3},
}
table.sort(arr, function(a, b)
assert(a.id ~= b.id)
return a.id < b.id
end)
for _, elm in pairs(arr) do
print(elm.id)
end

Roberto Ierusalimschy

unread,
Jul 12, 2024, 9:43:44 AM7/12/24
to lu...@googlegroups.com
> (NOTE: this issue has been already reported on StackOverflow
> <https://stackoverflow.com/questions/78739507/lua-table-sort-invokes-the-comparing-function-with-the-same-element>
> )

And it was already answered there.


> I have stumbled upon a strange behavior of `table.sort()`, in which the
> comparing function is invoked by passing the same array element in both
> parameters.

The sort algorithm gives no guarantees that it will not compare an
element with itself. It says only that the comparison function returns
true when the first element must come before the second. If the
elements are equal, one doesn't have to come before the other, so
the function should return false.

-- Roberto

Scott Morgan

unread,
Jul 12, 2024, 9:49:49 AM7/12/24
to lu...@googlegroups.com
On 12/07/2024 12:09, 'Sainan' via lua-l wrote:
> Have you considered that the issue might be with your sorting function? When I narrow it down to a minimum viable repro, I don't see the behaviour you are seeing:
>

Lua 5.4.4 Copyright (C) 1994-2022 Lua.org, PUC-Rio
> arr = { 3,4,2,5,1 }
> table.sort(arr, function(a,b) print(a,b); return a>b; end)
1 3
2 3
1 2
4 2
5 2
2 2 <--
2 5
5 3
4 5
3 4
> return table.concat(arr,",")
5,4,3,2,1

Does compare the same item

However, reversing the comparison:

> arr = { 3,4,2,5,1 }
> table.sort(arr, function(a,b) print(a,b); return a<b; end)
1 3
2 1
3 2
4 2
2 5
2 4
2 1
3 5
4 3
5 4
> return table.concat(arr,",")
1,2,3,4,5

As expected.

Sainan

unread,
Jul 12, 2024, 10:06:17 AM7/12/24
to lu...@googlegroups.com
Hmm, indeed the assertion fails when attempting to sort in a descending order. That seems a bit strange to me, but as Roberto already pointed out, it is not a defect per-se.

l...@cdelord.fr

unread,
Jul 12, 2024, 5:29:30 PM7/12/24
to lu...@googlegroups.com

As explained on stackoverflow it is not unexpected to compare the same element.

The comparison function shall implement a strict comparison (i.e. <, not <=).

E.g.:

``` lua


local arr = {
    {id="A", v=1},
    {id="B", v=1},
    {id="C", v=0},
    {id="D", v=1},

    {id="E", v=1},
}
```

The index can be added to the array itself to avoid using an extra table:

``` lua
for idx, elem in ipairs(arr) do
    elem.idx = idx
end
```

Then the array can be sorted with:

``` lua
table.sort(arr, function(a, b)
    if a.v == b.v then
        -- same value => compare indices
        return a.idx < b.idx
    end
    -- compare values
    return b.v < a.v
end)
```
--
You received this message because you are subscribed to the Google Groups "lua-l" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lua-l+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lua-l/983abeaa-46af-4258-ba70-01ffb7843ae0n%40googlegroups.com.

Sainan

unread,
Jul 12, 2024, 5:46:00 PM7/12/24
to lu...@googlegroups.com
Did you just send an email into the past?! You better thow that banana in the trash!

Christophe Delord

unread,
Jul 12, 2024, 5:55:13 PM7/12/24
to 'Sainan' via lua-l

Le 12/07/2024 à 23:45, 'Sainan' via lua-l a écrit :
> Did you just send an email into the past?! You better thow that banana
> in the trash!


I sent this email a few hours ago and it took a few hours to be
delivered... Is this because I sent it with Thunderbird instead of using
the web interface of google?

Sainan

unread,
Jul 12, 2024, 5:59:46 PM7/12/24
to lu...@googlegroups.com
I am just as confused as you are. I'm using Protonmail and my emails seem to deliver instantly.

Rett Berg

unread,
Jul 14, 2024, 4:14:29 PM7/14/24
to lua-l
> The sort algorithm gives no guarantees that it will not compare an
element with itself.

While this is not incorrect I really don't understand why a sorting algorithm would ever do this. This feels like a performance bug, especially since checking the indexes would be done in C whereas running the comparison is done in Lua.

It looks like the code is in "auxsort" in ltablib.c. I might be interested in trying my hand at removing the self comparison if you would be interested in including it? It seems to me a separate algorithm (simple swap / not-swap) should be done when there are two or fewer items.

"Normal" quicksorts for only integers will fallback on insertion sort for ~25-50 items since it is faster in the cache -- that wouldn't be the case for Lua tables, but I think we could use a different algorithm for 1-2 (or maybe even 3?) items for a modest speed increase and avoiding these kind of cases.

Best,
Rett

Roberto Ierusalimschy

unread,
Jul 15, 2024, 11:54:43 AM7/15/24
to lu...@googlegroups.com
> "Normal" quicksorts for only integers will fallback on insertion sort for
> ~25-50 items since it is faster in the cache -- that wouldn't be the case
> for Lua tables, but I think we could use a different algorithm for 1-2 (or
> maybe even 3?) items for a modest speed increase and avoiding these kind of
> cases.

If you look at the code you will see that it already uses a different
"algorithm" for up to 3 elements, while choosing a suitable pivot.

-- Roberto

Rett Berg

unread,
Jul 15, 2024, 12:02:18 PM7/15/24
to lu...@googlegroups.com
If you look at the code you will see that it already uses a different
"algorithm" for up to 3 elements, while choosing a suitable pivot.

I'll take another look. I'm rewriting the code for fun, I'll send it over and you can include it if interested.

What of the point that comparing an index with itself is (at minimum) a performance bug? Do you agree?


--
You received this message because you are subscribed to a topic in the Google Groups "lua-l" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/lua-l/Y61Xo2i4MGk/unsubscribe.
To unsubscribe from this group and all its topics, send an email to lua-l+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lua-l/20240715155437.GB595440%40arraial.inf.puc-rio.br.

Roberto Ierusalimschy

unread,
Jul 15, 2024, 12:47:43 PM7/15/24
to lu...@googlegroups.com
> What of the point that comparing an index with itself is (at minimum)
> a performance bug? Do you agree?

It deserves a better understanding, but I wouln't consider it a bug. I
for sure never thought that way when I wrote the code. In general,
the array may have repeated elements, so both the algorithm and the
comparison function should be ready to handle equal elements anyway.

The partition algorithm is tricky by itself, and Lua keeps some elements
on the stack---in particular the pivot---to avoid re-reading from the
array for each comparison. So, it is not obvious to the code which
indices are being compared at each comparison. Whether the extra
overhead nedded to avoid comparing an element to itself pays its price
is something to be seen. (In a quick test with 1e6 random elements,
less than 0.3% of the comparisons were between equal elements. When
the elements are already ordered, there were no comparisons between
equal elements. In reverse order, less than 0.1% were equal. When all
elements are equal, 100% of the comparisons were between equal
elements. :-)

-- Roberto
Reply all
Reply to author
Forward
0 new messages