[PATCH] add merge_unique for List

9 views
Skip to first unread message

Qian Yun

unread,
Jul 1, 2026, 10:21:02 AM (23 hours ago) Jul 1
to fricas-devel
I added new functions "merge_unique" and its destructive
version "merge_unique!" in aggcat.spad, and basically
copied the implementation to list.spad for inline optimization.

This is quite a common idiom, once added, I can simplify
some of the existing code with it.

- Qian
merge_unique.patch

Waldek Hebisch

unread,
Jul 1, 2026, 3:07:34 PM (18 hours ago) Jul 1
to fricas...@googlegroups.com
On Wed, Jul 01, 2026 at 10:20:19PM +0800, Qian Yun wrote:
> I added new functions "merge_unique" and its destructive
> version "merge_unique!" in aggcat.spad, and basically
> copied the implementation to list.spad for inline optimization.

I do not see why you have:

eq?(p, q) => error "cannot merge a list into itself"

AFAICS in such case

eq?(p, q) => p

should be fine. Actually, this looks like speed optimization,
since in the case of two lists that are equal but not 'eq?' the
code takes general branch and general part should also work
in 'eq?' case.

Also I would write 'cons(p1, r)' instead of 'concat(p1, r)'.
'concat' is overloaded and sometimes (not in this case)
overloading resolution takes different version. So, I prefer
to consistently use 'cons' in such case

> This is quite a common idiom, once added, I can simplify
> some of the existing code with it.

Yes

> diff --git a/src/algebra/aggcat.spad b/src/algebra/aggcat.spad
> index f4342d11..266f9212 100644
> --- a/src/algebra/aggcat.spad
> +++ b/src/algebra/aggcat.spad
> @@ -1822,6 +1822,14 @@ LinearAggregate(S : Type) : Category ==
> ++ trim(u, x) returns a copy of u with all occurrences of x deleted
> ++ from right and left ends.
> ++ For example, \spad{trim(" abc ", char " ")} returns \spad{"abc"}.
> + merge_unique : ((S, S) -> Boolean, %, %) -> %
> + ++ merge_unique(p, a, b) returns an aggregate c which merges a and b
> + ++ uniquely. p is a binary predicate and aggregate a and b is assumed
> + ++ to be ordered under p and without duplicates. The result
> + ++ aggregate c is also ordered under p, and contains all elements
> + ++ from a and b, also without duplicates.
> + ++ For example, \spad{merge_unique(<, [1, 3, 5], [3, 5, 6])} returns
> + ++ \spad{[1, 3, 5, 6]}.
> if S has Hashable then Hashable
> if S has Comparable then Comparable
> if S has OrderedSet then
> @@ -2227,6 +2235,9 @@ ExtensibleLinearAggregate(S : Type) : Category == Join(LinearAggregate S, _
> ++ remove!(x, u) destructively removes all values x from u.
> removeDuplicates! : % -> %
> ++ removeDuplicates!(u) destructively removes duplicates from u.
> + merge_unique! : ((S, S) -> Boolean, %, %) -> %
> + ++ merge_unique!(p, u, v) destructively merges u and v uniquely
> + ++ using predicate p.
> if S has OrderedSet then merge! : (%, %) -> %
> ++ merge!(u, v) destructively merges u and v in ascending order.
> add
> @@ -2417,6 +2428,47 @@ ListAggregate(S : Type) : Category == Join(StreamAggregate S,
> x := rest x
> r
>
> + merge_unique(f, p, q) ==
> + empty? p => q
> + empty? q => p
> + eq?(p, q) => error "cannot merge a list into itself"
> + if f(first p, first q)
> + then (r := list first p; p := rest p)
> + else (r := list first q; q := rest q)
> + t1 := first r
> + while not empty? p and not empty? q repeat
> + p1 := first p; q1 := first q
> + p1 = t1 => p := rest p
> + q1 = t1 => q := rest q
> + if f(p1, q1)
> + then (r := concat(p1, r); p := rest p; t1 := p1)
> + else (r := concat(q1, r); q := rest q; t1 := q1)
> + if empty? p then p := q
> + empty? p => reverse! r
> + if first p = t1 then p := rest p
> + concat!(reverse! r, p)
> +
> + merge_unique!(f, p, q) ==
> + empty? p => q
> + empty? q => p
> + eq?(p, q) => error "cannot merge a list into itself"
> + if f(first p, first q)
> + then (r := t := p; p := rest p)
> + else (r := t := q; q := rest q)
> + t1 := first t
> + while not empty? p and not empty? q repeat
> + p1 := first p; q1 := first q
> + p1 = t1 => p := rest p
> + q1 = t1 => q := rest q
> + if f(p1, q1)
> + then (setrest!(t, p); t := p; p := rest p; t1 := p1)
> + else (setrest!(t, q); t := q; q := rest q; t1 := q1)
> + if empty? p then p := q
> + empty? p => r
> + if first p = t1 then p := rest p
> + setrest!(t, p)
> + r
> +
> new(n, s) ==
> l := empty()
> for k in 1..n repeat l := concat(s, l)
> diff --git a/src/algebra/list.spad b/src/algebra/list.spad
> index 176dd4a6..34a8b1d1 100644
> --- a/src/algebra/list.spad
> +++ b/src/algebra/list.spad
> @@ -163,6 +163,47 @@ List(S : Type) : Exports == Implementation where
> if s = Qfirst x then return true else x := Qrest x
> false
>
> + merge_unique(f, p, q) ==
> + empty? p => q
> + empty? q => p
> + eq?(p, q) => error "cannot merge a list into itself"
> + if f(Qfirst p, Qfirst q)
> + then (r := list Qfirst p; p := Qrest p)
> + else (r := list Qfirst q; q := Qrest q)
> + t1 : S := Qfirst r
> + while not empty? p and not empty? q repeat
> + p1 : S := Qfirst p; q1 : S := Qfirst q
> + p1 = t1 => p := Qrest p
> + q1 = t1 => q := Qrest q
> + if f(p1, q1)
> + then (r := concat(p1, r); p := Qrest p; t1 := p1)
> + else (r := concat(q1, r); q := Qrest q; t1 := q1)
> + if empty? p then p := q
> + empty? p => reverse! r
> + if Qfirst p = t1 then p := Qrest p
> + concat!(reverse! r, p)
> +
> + merge_unique!(f, p, q) ==
> + empty? p => q
> + empty? q => p
> + eq?(p, q) => error "cannot merge a list into itself"
> + if f(Qfirst p, Qfirst q)
> + then (r := t := p; p := Qrest p)
> + else (r := t := q; q := Qrest q)
> + t1 : S := Qfirst t
> + while not empty? p and not empty? q repeat
> + p1 : S := Qfirst p; q1 : S := Qfirst q
> + p1 = t1 => p := Qrest p
> + q1 = t1 => q := Qrest q
> + if f(p1, q1)
> + then (qsetrest!(t, p); t := p; p := Qrest p; t1 := p1)
> + else (qsetrest!(t, q); t := q; q := Qrest q; t1 := q1)
> + if empty? p then p := q
> + empty? p => r
> + if Qfirst p = t1 then p := Qrest p
> + qsetrest!(t, p)
> + r
> +
> if S has Hashable then
>
> hashUpdate!(s : HashState, x : %) : HashState ==


--
Waldek Hebisch

Qian Yun

unread,
Jul 1, 2026, 6:00:58 PM (16 hours ago) Jul 1
to fricas...@googlegroups.com
On 7/2/26 3:07 AM, Waldek Hebisch wrote:
> On Wed, Jul 01, 2026 at 10:20:19PM +0800, Qian Yun wrote:
>> I added new functions "merge_unique" and its destructive
>> version "merge_unique!" in aggcat.spad, and basically
>> copied the implementation to list.spad for inline optimization.
>
> I do not see why you have:
>
> eq?(p, q) => error "cannot merge a list into itself"
>
> AFAICS in such case
>
> eq?(p, q) => p
>
> should be fine. Actually, this looks like speed optimization,
> since in the case of two lists that are equal but not 'eq?' the
> code takes general branch and general part should also work
> in 'eq?' case.
>

Yes, that line is copied from "merge", which is not needed
in "merge_unique".

> Also I would write 'cons(p1, r)' instead of 'concat(p1, r)'.
> 'concat' is overloaded and sometimes (not in this case)
> overloading resolution takes different version. So, I prefer
> to consistently use 'cons' in such case

I can use 'cons' in List. I used 'concat' because 'cons'
is not available in LSAGG.

- Qian

Waldek Hebisch

unread,
Jul 1, 2026, 7:27:40 PM (14 hours ago) Jul 1
to fricas...@googlegroups.com
On Thu, Jul 02, 2026 at 06:00:53AM +0800, Qian Yun wrote:
> On 7/2/26 3:07 AM, Waldek Hebisch wrote:
> > On Wed, Jul 01, 2026 at 10:20:19PM +0800, Qian Yun wrote:
> >> I added new functions "merge_unique" and its destructive
> >> version "merge_unique!" in aggcat.spad, and basically
> >> copied the implementation to list.spad for inline optimization.
> >
> > I do not see why you have:
> >
> > eq?(p, q) => error "cannot merge a list into itself"
> >
> > AFAICS in such case
> >
> > eq?(p, q) => p
> >
> > should be fine. Actually, this looks like speed optimization,
> > since in the case of two lists that are equal but not 'eq?' the
> > code takes general branch and general part should also work
> > in 'eq?' case.
> >
>
> Yes, that line is copied from "merge", which is not needed
> in "merge_unique".

I see, it makes some sense in 'merge!', but AFAICS is too weak
for correctness. Namely, for 'merge!' needed condition is that
the lists have no common node (but in general this would be
quite expensive to check, probably defeating efficiency gain
from 'merge!').

AFAICS 'merge_unique!' does not need such a condition.

> > Also I would write 'cons(p1, r)' instead of 'concat(p1, r)'.
> > 'concat' is overloaded and sometimes (not in this case)
> > overloading resolution takes different version. So, I prefer
> > to consistently use 'cons' in such case
>
> I can use 'cons' in List. I used 'concat' because 'cons'
> is not available in LSAGG.

OK, so in ListAggregate we need 'concat'.

Looking more, 'merge_unique' is exported in LinearAggregate, but
implementd only in ListAggregate. So it would be unimplemented
in array domains. So we need implementation in
OneDimensionalArrayAggregate (a simple one based on 'merge' would
do). Another way to go is to export 'merge_unique' only in
ListAggregate.

--
Waldek Hebisch

Qian Yun

unread,
Jul 1, 2026, 7:50:52 PM (14 hours ago) Jul 1
to fricas...@googlegroups.com
On 7/2/26 7:27 AM, Waldek Hebisch wrote:
> On Thu, Jul 02, 2026 at 06:00:53AM +0800, Qian Yun wrote:
>> On 7/2/26 3:07 AM, Waldek Hebisch wrote:
>>> On Wed, Jul 01, 2026 at 10:20:19PM +0800, Qian Yun wrote:
>>>> I added new functions "merge_unique" and its destructive
>>>> version "merge_unique!" in aggcat.spad, and basically
>>>> copied the implementation to list.spad for inline optimization.
>>>
>>> I do not see why you have:
>>>
>>> eq?(p, q) => error "cannot merge a list into itself"
>>>
>>> AFAICS in such case
>>>
>>> eq?(p, q) => p
>>>
>>> should be fine. Actually, this looks like speed optimization,
>>> since in the case of two lists that are equal but not 'eq?' the
>>> code takes general branch and general part should also work
>>> in 'eq?' case.
>>>
>>
>> Yes, that line is copied from "merge", which is not needed
>> in "merge_unique".
>
> I see, it makes some sense in 'merge!', but AFAICS is too weak
> for correctness. Namely, for 'merge!' needed condition is that
> the lists have no common node (but in general this would be
> quite expensive to check, probably defeating efficiency gain
> from 'merge!').
>

Yes, it's user's responsibility to make sure arguments to 'merge!'
do not share cons cells.

> AFAICS 'merge_unique!' does not need such a condition.
>

For 'merge_unique!' it can serve as optimization.

>>> Also I would write 'cons(p1, r)' instead of 'concat(p1, r)'.
>>> 'concat' is overloaded and sometimes (not in this case)
>>> overloading resolution takes different version. So, I prefer
>>> to consistently use 'cons' in such case
>>
>> I can use 'cons' in List. I used 'concat' because 'cons'
>> is not available in LSAGG.
>
> OK, so in ListAggregate we need 'concat'.
>
> Looking more, 'merge_unique' is exported in LinearAggregate, but
> implementd only in ListAggregate. So it would be unimplemented
> in array domains. So we need implementation in
> OneDimensionalArrayAggregate (a simple one based on 'merge' would
> do). Another way to go is to export 'merge_unique' only in
> ListAggregate.
>

For arrays, it requires an extra copy because the result size
might be smaller than (#a + #b) if there's common elements in a and b.

But such copy is not needed in FlexibleArray.

I can add the generic implementation in A1AGG. And leave the optimized
implementation for the future when the need arises.

- Qian

Waldek Hebisch

unread,
Jul 1, 2026, 8:00:41 PM (14 hours ago) Jul 1
to fricas...@googlegroups.com
Yes, it is important to have working version. Trying to optimize
array version would be premature.

--
Waldek Hebisch

Qian Yun

unread,
Jul 1, 2026, 11:40:35 PM (10 hours ago) Jul 1
to fricas...@googlegroups.com
I've added default implementation for A1AGG.

During the process, I find that we can utilize the fact
that the argument does not contain duplicates,
so the implementation for List can also be simplified a bit.

- Qian
improve-merge_unique.patch

Qian Yun

unread,
Jul 1, 2026, 11:49:08 PM (10 hours ago) Jul 1
to fricas...@googlegroups.com
Oops, missed the part for LSAGG.

Please see this new version.

- Qian
improve-merge_unique-v2.patch

Waldek Hebisch

unread,
6:48 AM (3 hours ago) 6:48 AM
to fricas...@googlegroups.com
On Thu, Jul 02, 2026 at 11:49:03AM +0800, Qian Yun wrote:
> Oops, missed the part for LSAGG.
>
> Please see this new version.

Looks OK, please commit.

> On 7/2/26 11:40 AM, Qian Yun wrote:
> > I've added default implementation for A1AGG.
> >
> > During the process, I find that we can utilize the fact
> > that the argument does not contain duplicates,
> > so the implementation for List can also be simplified a bit.
> >
> > - Qian
>
> --
> You received this message because you are subscribed to the Google Groups "FriCAS - computer algebra system" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to fricas-devel...@googlegroups.com.
> To view this discussion visit https://groups.google.com/d/msgid/fricas-devel/8e5da9fb-6b98-4f13-9171-c2f50078c2e3%40gmail.com.


--
Waldek Hebisch
Reply all
Reply to author
Forward
0 new messages