Sorting both numerical and symbol items reasonably

Duane Nykamp

Oct 18, 2014, 9:52:50 PM
to sy...@googlegroups.com
I was using sympy's default_sort_key to sort objects.  However, I realized that it does not sort numerical quantities in numerical order:

In [15]: sorted([-1, -1-sqrt(2),1+sqrt(2), pi, 4], key=default_sort_key)
Out[15]: [-1, 4, pi, 1 + sqrt(2), -sqrt(2) - 1]

Sorting without a key (the default, "lex") works in this case:

In [16]: sorted([-1, -1-sqrt(2),1+sqrt(2), pi, 4])
Out[16]: [-sqrt(2) - 1, -1, 1 + sqrt(2), pi, 4]

but that doesn't work on symbolic objects.

Is there a way to sort that works on all sympy objects, symbolic or not, but still produces the normal order for numerical quantities?

The only thing I can think of is trying the lex sort key first, then reverting to default_sort_key on TypeError.  But, that has the undesirable effect of changing the relative order of numerical quantities once a symbolic quantity is added to the mix.
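
A minimal sketch of that fallback idea follows. The helper name sort_with_fallback is hypothetical, not part of sympy. It shows the drawback described above: adding a single symbol switches the whole list over to default_sort_key, so the numbers are no longer in numerical order.

```python
from sympy import Symbol, default_sort_key, pi, sqrt


def sort_with_fallback(items):
    """Try plain comparison first; use default_sort_key if that fails."""
    try:
        return sorted(items)
    except TypeError:
        # Comparing against a symbol gives an undecidable relational,
        # whose truth value raises TypeError.
        return sorted(items, key=default_sort_key)


x = Symbol('x')
numbers = [-1, -1 - sqrt(2), 1 + sqrt(2), pi, 4]

print(sort_with_fallback(numbers))        # plain numerical comparison
print(sort_with_fallback(numbers + [x]))  # falls back to default_sort_key
```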

Any suggestions or pointers to a place where this is discussed already?

Thanks,
Duane

Duane Nykamp

Oct 18, 2014, 11:40:53 PM
to sy...@googlegroups.com
Here's my solution for a customized sort key.  Does this seem reasonable?  I just copied the format for the sort key for numbers and applied it to anything that was real.  It seems to work how I'd want it to work.


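from sympy import default_sort_key, sympify, SympifyError

def customized_sort_key(item, order=None):
    try:
        x = sympify(item)
    except SympifyError:
        pass
    else:
        # Give anything real the same key layout that numbers use.
        if x.is_real:
            return ((1, 0, 'Number'), (0, ()), (), x)

    return default_sort_key(item, order)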

Mateusz Paprocki

Oct 20, 2014, 6:48:47 AM
to sympy
Hi,

On 19 October 2014 05:40, Duane Nykamp <dqny...@comcast.net> wrote:
> Here's my solution for a customized sort key. Does this seem reasonable? I
> just copied the format for the sort key for numbers and applied it to
> anything that was real. It seems to work how I'd want it to work.
>
>
> def customized_sort_key(item, order=None):
>
>     try:
>         x=sympify(item)
>     except:
>         pass
>     else:
>         if x.is_real:
>             return ((1, 0, 'Number'), (0, ()), (), x)
>
>     return default_sort_key(item,order)
>

This won't work, because a Symbol can be real too, e.g.:

In [1]: r = Symbol('r', real=True)

In [2]: r.is_real
Out[2]: True

Really what you should be looking into are the sort_key() methods
defined all over sympy. default_sort_key() is just a convenience
function that also handles non-Basic objects, which makes it usable
as a key to sorted().
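
As a rough illustration of that delegation (a sketch, not a description of every code path): for a Basic instance, default_sort_key returns the object's own sort_key(), and a plain Python int is sympified first. The variable names below are only for illustration.

```python
from sympy import Integer, Symbol, default_sort_key

x = Symbol('x')

# Basic instances: default_sort_key defers to the object's sort_key().
same_for_symbol = default_sort_key(x) == x.sort_key()

# Non-Basic objects such as a Python int are converted first,
# so they end up with the same key as the sympy equivalent.
same_for_int = default_sort_key(2) == Integer(2).sort_key()

# This is what lets a mixed list be sorted without a TypeError.
mixed = sorted([x, 2, 'a', (1, 2)], key=default_sort_key)

print(same_for_symbol, same_for_int, mixed)
```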

Defining the default sort key isn't that simple, e.g.:

In [1]: l2 = [1 - sqrt(2), 1, pi, 4, 1 + sqrt(2)]

In [2]: lx = [1 - sqrt(x), 1, pi, 4, 1 + sqrt(x)]

In [3]: l2I = map(lambda x: x + I, l2)

In [4]: lxI = map(lambda x: x + I, lx)

In [5]: sorted(l2, key=default_sort_key)
Out[5]: [1, 4, π, 1 + √2, -√2 + 1]

In [6]: sorted(lx, key=default_sort_key)
Out[6]: [1, 4, π, -√x + 1, √x + 1]

In [7]: sorted(l2I, key=default_sort_key)
Out[7]: [1 + ⅈ, 4 + ⅈ, π + ⅈ, 1 + √2 + ⅈ, -√2 + 1 + ⅈ]

In [8]: sorted(lxI, key=default_sort_key)
Out[8]: [1 + ⅈ, 4 + ⅈ, π + ⅈ, -√x + 1 + ⅈ, √x + 1 + ⅈ]

Will your key provide the same level of uniformity? Perhaps it could,
with a fair amount of symbolic processing (using as_real_imag(), etc.),
but then you would increase the cost of sort_key() considerably and,
most likely, make the printing framework even slower than it already is.

There should be some old discussions regarding this on the mailing
list or in the issue tracker. At the time, it was a pretty big
undertaking (historically sort_key started as as_tuple_tree, so you
may also search for that keyword).

Mateusz


Duane Nykamp

Oct 20, 2014, 10:24:01 AM
to sy...@googlegroups.com
Thanks for the tip about making a Symbol real.  That is easily fixed, though.  I can simply test
if x.is_number and x.is_real

Or, I discovered the attribute .is_comparable.  So it seems there are two reasonable options. 

1.  Test if x.is_number and x.is_real.  If so, create a sort key for number.

 Or,

2.  Test if x.is_comparable.  If so, create a sort key for number.

In either case, return default_sort_key if condition is not met.
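
To make the difference between the tests concrete, here is a small sketch that prints the three attributes for a few sample expressions. The sample list is only an illustration. A real Symbol passes is_real but fails is_number and is_comparable, which is why either option excludes it.

```python
from sympy import I, Symbol, pi, sqrt

r = Symbol('r', real=True)
samples = [1 + sqrt(2), pi, r, 1 + I]

for expr in samples:
    # A real Symbol is real but not a number; 1 + I is a number but not real.
    print(expr, expr.is_real, expr.is_number, expr.is_comparable)
```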

You raise some concerns about .as_real_imag being slow.  If that is the case, then option 2 could be slow, as .is_comparable is based on .as_real_imag.  However, I don't see what objections can be made to option 1, except that it does not cover cases like (I*exp_polar(I*pi/2)).  Are there cases where the standard default_sort_key works better than option 1?  To my naive understanding, option 1 does everything that default_sort_key does while preserving the ordering of real numbers.  (I'm not worried about how it sorts complex numbers.  I suppose it would take some more logic to get purely imaginary numbers sorted like their real counterparts.  But, to me, what is important is getting real numbers sorted as expected.)

For example, with option 1 written as


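from sympy import default_sort_key, sympify, SympifyError

def customized_sort_key(item, order=None):
    try:
        x = sympify(item)
    except SympifyError:
        pass
    else:
        # Only actual real numbers (not real-valued symbols) get the number key.
        if x.is_real and x.is_number:
            return ((1, 0, 'Number'), (0, ()), (), x)
    return default_sort_key(item, order)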

I get the following results:

In [4]: l2 = [1 - sqrt(2), 1, pi, 4, 1 + sqrt(2)]

In [5]: lx = [1 - sqrt(x), 1, pi, 4, 1 + sqrt(x)]

In [6]: l2I = map(lambda x: x + I, l2)

In [7]: lxI = map(lambda x: x + I, lx)

In [8]: sorted(l2, key=customized_sort_key)
Out[8]: [-sqrt(2) + 1, 1, 1 + sqrt(2), pi, 4]

In [9]: sorted(lx, key=customized_sort_key)
Out[9]: [1, pi, 4, -sqrt(x) + 1, sqrt(x) + 1]

In [10]: sorted(l2I, key=customized_sort_key)
Out[10]: [1 + I, 4 + I, pi + I, 1 + sqrt(2) + I, -sqrt(2) + 1 + I]

In [11]: sorted(lxI, key=customized_sort_key)
Out[11]: [1 + I, 4 + I, pi + I, -sqrt(x) + 1 + I, sqrt(x) + 1 + I]


I guess the one objection I can see to this solution is that it is not applied recursively.  It doesn't use the same logic, for example, on numbers within Tuples or on the coefficients of terms with symbols.  But, if one just applies option 1 recursively, then it shouldn't be slower than the default, should it?
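
One way to sketch the recursive idea, limited to Python tuples and lists: it does not touch coefficients inside symbolic terms. The name recursive_sort_key is hypothetical, and the container key layout is an assumption copied from what default_sort_key appears to build internally, so it may change between sympy versions.

```python
from sympy import S, Symbol, SympifyError, default_sort_key, sqrt, sympify


def recursive_sort_key(item, order=None):
    # Containers: build the key from the keys of the elements
    # (assumed to mirror default_sort_key's layout for iterables).
    if isinstance(item, (tuple, list)):
        args = tuple(recursive_sort_key(arg, order) for arg in item)
        return ((10, 0, type(item).__name__), (len(args), args),
                S.One.sort_key(), S.One)
    try:
        x = sympify(item)
    except SympifyError:
        return default_sort_key(item, order)
    # Real numbers compare by value, everything else as usual.
    if x.is_real and x.is_number:
        return ((1, 0, 'Number'), (0, ()), (), x)
    return default_sort_key(item, order)


x = Symbol('x')
pairs = [(1 + sqrt(2), x), (1 - sqrt(2), x)]
print(sorted(pairs, key=recursive_sort_key))
```

With this key, the two pairs are ordered by the numerical value of their first entries.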

Thanks,
Duane

Aaron Meurer

Oct 20, 2014, 5:01:10 PM
to sy...@googlegroups.com
On Mon, Oct 20, 2014 at 9:24 AM, Duane Nykamp <dqny...@comcast.net> wrote:
> Thanks for the tip about making a Symbol real. That is easily fixed,
> though. I can simply test
> if x.is_number and x.is_real
>
> Or, I discovered the attribute .is_comparable. So it seems there are two
> reasonable options.
>
> 1. Test if x.is_number and x.is_real. If so, create a sort key for number.
>
> Or,
>
> 2. Test if x.is_comparable. If so, create a sort key for number.
>
> In either case, return default_sort_key if condition is not met.
>
> You raise some concerns about .as_real_imag being slow. So, if that is the

It is potentially slow, because it has to do some computation.
Consider for instance ((1 + x)**200).as_real_imag().
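
A rough way to see this for yourself is to time the call for growing exponents. This is only a sketch: the helper name time_as_real_imag is made up, the exponents are kept small so it finishes quickly, and the actual timings depend on the machine and the sympy version.

```python
import time

from sympy import Symbol

x = Symbol('x')


def time_as_real_imag(n):
    """Return the seconds spent in ((1 + x)**n).as_real_imag()."""
    expr = (1 + x)**n
    start = time.perf_counter()
    expr.as_real_imag()
    return time.perf_counter() - start


for n in (5, 20, 40):
    print(n, time_as_real_imag(n))
```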

Aaron Meurer