adding missing hash functions to Expression and related domains

70 views
Skip to first unread message

Bill Page

unread,
Mar 23, 2015, 10:41:01 AM3/23/15
to fricas-devel
The following patches provide hashUpdate! and as a result hash
functions to the Expression, Kernel, Float and Complex domains. Also
included is a patch to XHashTable to optionally pass a user specified
function representing equality in the key domain. This is important
for the use of XHashTable in domains like Expression where equality is
not canonical.

Please review and let me know if I can commit.

wspage@opensuse:~/fricas-ker> svn diff > ../exprhash.patch
wspage@opensuse:~/fricas-ker> cat ../exprhash.patch
Index: src/algebra/expr.spad
===================================================================
--- src/algebra/expr.spad (revision 1893)
+++ src/algebra/expr.spad (working copy)
@@ -147,7 +147,10 @@
x : % = y : % == (x - y) =$Rep 0$Rep
numer x == numer(x)$Rep
denom x == denom(x)$Rep
-
+ hashUpdate!(s : HashState, x : %) : HashState ==
+ s := hashUpdate!(s, numer x)
+ s := hashUpdate!(s, denom x)
+
EREP := Record(num : MP, den : MP)

coerce(p : MP) : % == [p, 1]$EREP pretend %
Index: src/algebra/float.spad
===================================================================
--- src/algebra/float.spad (revision 1893)
+++ src/algebra/float.spad (working copy)
@@ -185,6 +185,12 @@
if wholeObj then
OMputEndObject(dev)

+ hashUpdate!(s : HashState, x : %) : HashState ==
+ s := hashUpdate!(s, mantissa x)
+ s := hashUpdate!(s, exponent x)
+ s
+
+
shift2(x, y) == sign(x)*shift(sign(x)*x, y)

asin x ==
Index: src/algebra/gaussian.spad
===================================================================
--- src/algebra/gaussian.spad (revision 1893)
+++ src/algebra/gaussian.spad (working copy)
@@ -589,6 +589,12 @@
if wholeObj then
OMputEndObject(dev)

+ hashUpdate!(s : HashState, x : %) : HashState ==
+ s := hashUpdate!(s, real x)
+ s := hashUpdate!(s, imag x)
+ s
+
+
0 == [0, 0]
1 == [1, 0]
zero? x == zero?(x.real) and zero?(x.imag)
Index: src/algebra/kl.spad
===================================================================
--- src/algebra/kl.spad (revision 1893)
+++ src/algebra/kl.spad (working copy)
@@ -376,6 +376,13 @@
setPredicates(s, l)
o [convert x for x in argument(k)]$List(Pattern Float)

+ hashUpdate!(s : HashState, k : %) : HashState ==
+ s := hashUpdate!(s, name k)
+ s := hashUpdate!(s, height k)
+ for x in argument(k) repeat
+ s := hashUpdate!(s, x)
+ s
+
)abbrev package KERNEL2 KernelFunctions2
++ Description:
++ This package exports some auxiliary functions on kernels
Index: src/algebra/xhash.spad
===================================================================
--- src/algebra/xhash.spad (revision 1893)
+++ src/algebra/xhash.spad (working copy)
@@ -89,6 +89,7 @@
N ==> NonNegativeInteger
Z ==> Integer
I ==> SingleInteger
+B ==> Boolean

)if LiterateDoc
Let us now discuss the overall domain implementation.
@@ -102,7 +103,7 @@
++ Keywords: hash table
++ Description:
++ An implementation of a hash table that uses equality of the key domain
-++ to decide upon equality of keys.
+++ or a user specified function to decide upon equality of keys.
XHashTable(Key: SetCategory, Entry: Type):
Join(TableAggregate(Key, Entry), finiteAggregate, shallowlyMutable) with
table: (Key -> SingleInteger) -> %
@@ -111,6 +112,10 @@
++ the sense that from k1=k2 follows h(k1)=h(k2). If that is not
++ the case, k1 and k2 will internally be considered as being
++ different keys.
+ table: (Key -> SingleInteger,(Key,Key) -> Boolean) -> %
+ ++ table(h,eq) creates an empty hash table that uses eq instead
+ ++ of the "=" from the Key domain. Note that h and eq must be
+ ++ compatible in the sense that h(k1)~=h(k2) -> not eq(k1,k2)
== add
KE ==> Record(key: Key, entry: Entry)
UE ==> Union(Entry, "failed")
@@ -257,7 +262,8 @@
maxNumOfVirtualEntries: Z,_
idx: Z,_
arr: Buckets,_
- hashFunction: Key -> I)
+ hashFunction: Key -> I,_
+ eqFunction: (Key,Key) -> Boolean)

)if LiterateDoc
Note that for a value $r$ of type \texttt{Rep} it always holds:
@@ -334,7 +340,7 @@
stored and thus the loops eventually terminate.
)endif

- localSearch(a: Buckets, k: Key, h: Key -> I): Z ==
+ localSearch(a: Buckets, k: Key, h: Key -> I, eq:(Key,Key) -> B): Z ==
update!(p, mk) ==>
p := p + h2
if p>=n then p := p-n
@@ -348,12 +354,12 @@
deletedPosition?: Boolean := false
while not vacant? mk repeat
deleted? mk => (deletedPosition? := true; break)
- k = toKey mk => return p -- key found
+ eq(k,toKey mk) => return p -- key found
update!(p, mk)
q := p -- first position of a free entry
-- We ignore DELETED entries when looking for the key.
while not vacant? mk repeat
- not deleted? mk and k = toKey mk =>
+ not deleted? mk and eq(k,toKey mk) =>
setKeyEntry!(a, n, q, k, getEntry(a, n, p))
deleteKeyEntry!(a, n, p)
return q -- entry has been copied to previous DELETED position
@@ -386,6 +392,7 @@
m: N := arrayLengths ix
r: Rep := rep x
h: Key -> I := r.hashFunction
+ eq: (Key,Key) -> B := r.eqFunction
a: Buckets := r.arr
n: Z := numOfBuckets a
c: Buckets := newArr m
@@ -393,7 +400,7 @@
k: Key := toKey mk
-- Note that k is not in c, and there are no DELETED positions.
-- Thus, -m<=p<0.
- p := m + localSearch(c, k, h)
+ p := m + localSearch(c, k, h, eq)
setKeyEntry!(c, m, p, k, getEntry(a, n, i))
r.arr := c --destructively set new array
r.idx := ix
@@ -433,11 +440,15 @@
function into the representation.
)endif

- table(hashfunction: Key -> SingleInteger): % ==
+ table(hashfunction: Key -> SingleInteger,eqfunction: (Key,Key) ->
Boolean): % ==
n: N := arrayLengths 0
maxEntries: Z := maxLoad n
maxVirtualEntries: Z := maxVirtualLoad n
- per [0, maxEntries, 0, maxVirtualEntries, 0, newArr n, hashfunction]
+ hashfunction := forceLazySlot((hash$Key)@(Key -> I))$Lisp
pretend (Key->I)
+ per [0, maxEntries, 0, maxVirtualEntries, 0, newArr n,
hashfunction, eqfunction]
+ table(hashfunction: Key -> SingleInteger): % ==
+ table(hashfunction,forceLazySlot((_=$Key)@((Key,Key) ->
B))$Lisp pretend ((Key,Key)->B))
+
empty(): % ==
table(forceLazySlot((hash$Key)@(Key -> I))$Lisp pretend (Key->I))

@@ -465,19 +476,22 @@
search(k: Key, x: %): Union(Entry, "failed") ==
a: Buckets := rep(x).arr
h: Key -> I := rep(x).hashFunction
- p: Z := localSearch(a, k, h)
+ eq: (Key,Key) -> B := rep(x).eqFunction
+ p: Z := localSearch(a, k, h, eq)
p<0 => "failed"
getEntry(a, numOfBuckets a, p)::UE
elt(x: %, k: Key): Entry ==
a: Buckets := rep(x).arr
h: Key -> I := rep(x).hashFunction
- p := localSearch(a, k, h)
+ eq: (Key,Key) -> B := rep(x).eqFunction
+ p := localSearch(a, k, h, eq)
p<0 => error "key not in table"
getEntry(a, numOfBuckets a, p)
elt(x: %, k: Key, e: Entry): Entry ==
a: Buckets := rep(x).arr
h: Key -> I := rep(x).hashFunction
- p := localSearch(a, k, h)
+ eq: (Key,Key) -> B := rep(x).eqFunction
+ p := localSearch(a, k, h, eq)
p<0 => e
getEntry(a, numOfBuckets a, p)

@@ -500,7 +514,8 @@
if rep(x).numOfEntries >= rep(x).maxNumOfEntries then grow! x
a: Buckets := rep(x).arr
h: Key -> I := rep(x).hashFunction
- p := localSearch(a, k, h)
+ eq: (Key,Key) -> B := rep(x).eqFunction
+ p := localSearch(a, k, h, eq)
n: Z := numOfBuckets a
p>=0 => setEntry!(a, n, p, e)
r: Rep := rep x
@@ -527,7 +542,8 @@
remove!(k: Key, x: %): Union(Entry, "failed") ==
a: Buckets := rep(x).arr
h: Key -> I := rep(x).hashFunction
- p := localSearch(a, k, h)
+ eq: (Key,Key) -> B := rep(x).eqFunction
+ p := localSearch(a, k, h, eq)
p<0 => "failed" -- key has not been found
n: Z := numOfBuckets a
e: Entry := getEntry(a, n, p) -- to be returned
@@ -546,7 +562,7 @@
r: Rep := rep x
per [r.numOfEntries, r.maxNumOfEntries,_
r.numOfDeletedEntries, r.maxNumOfVirtualEntries,_
- r.idx, copy(r.arr), r.hashFunction]
+ r.idx, copy(r.arr), r.hashFunction, r.eqFunction]

)if LiterateDoc
\subsubsection{Setting common values}
@@ -621,8 +637,9 @@
xa := rep(x).arr; xn := numOfBuckets xa
ya := rep(y).arr; yn := numOfBuckets ya
h := rep(y).hashFunction
+ eq := rep(y).eqFunction
for i in 0..xn - 1 | key?(mk: MKey := getMKey(xa, i)) repeat
- p := localSearch(ya, toKey mk, h)
+ p := localSearch(ya, toKey mk, h, eq)
p < 0 => return false
getEntry(xa, xn, i) ~= getEntry(ya, yn, p) => return false
true
wspage@opensuse:~/fricas-ker>
exprhash.patch

Ralf Hemmecke

unread,
Mar 23, 2015, 6:27:25 PM3/23/15
to fricas...@googlegroups.com
On 03/23/2015 03:41 PM, Bill Page wrote:
> The following patches provide hashUpdate! and as a result hash
> functions to the Expression, Kernel, Float and Complex domains.

Looks OK to me. And if these domains export SetCategory (which is true)
there should be an implementation of hashUpdate!.

> Also included is a patch to XHashTable to optionally pass a user
> specified function representing equality in the key domain. This is
> important for the use of XHashTable in domains like Expression where
> equality is not canonical.

I see what you are aiming at, but I'm not sure whether I should support
your particular patch. Anyway, hashUpdate! has nothing to do with
XHashTable. So if your stuff is accepted, it should be two separate commits.

However, if you modify the code to allow an eqFunction in a table
construction, then I see no reason to require the Key parameter
http://fricas.github.io/api/XHashTable
to be of type SetCategory. The export of

table: (Key -> I) -> %

could be made conditional "if Key has SetCategory then ...".

In fact, it seems you would like to construct

E ==> Expression Integer
T ==> XHashTable(E, ENTRY)

and then create the table with

t: T := table(hash$E, eq)

but which function eq (other than =$E) would make sense in that place?

If you use =$E and have two expressions e1 and e2 (with e1=e2) that have
different representation, then storing some entries with keys e1 and e2
will (most probably) give 2 table entries since hash(e1) will be
different from hash(e2). Your specification says that for that case you
would simply consider e1 and e2 not equal with respect to the given eq
function. So that would also give two table entries.

This sounds like it would make more sense creating a domain
ExpressionRep(X: ...) that exports the equality from the underlying
representation and then define

Expression(X: ...): ... == ExpressionRep(X) add
Rep := ExpressionRep(X)
((x: %) = (y: %)): Boolean == (x - y) =$Rep 0

Then you are able to use

XHashTable(ExpressionRep(X), ENTRY)

and use =$ExpressionRep(X).

I'm somehow missing some concrete use case to find you patch to
XHashTable useful.

Ralf

Bill Page

unread,
Mar 23, 2015, 9:27:03 PM3/23/15
to fricas-devel
On 23 March 2015 at 18:27, Ralf Hemmecke <ra...@hemmecke.org> wrote:
> On 03/23/2015 03:41 PM, Bill Page wrote:
>> The following patches provide hashUpdate! and as a result hash
>> functions to the Expression, Kernel, Float and Complex domains.
>
> Looks OK to me. And if these domains export SetCategory (which is true)
> there should be an implementation of hashUpdate!.
>

Yes, these functions were expected but the implementation was missing.

>> Also included is a patch to XHashTable to optionally pass a user
>> specified function representing equality in the key domain. This is
>> important for the use of XHashTable in domains like Expression where
>> equality is not canonical.
>
> I see what you are aiming at, but I'm not sure whether I should support
> your particular patch. Anyway, hashUpdate! has nothing to do with
> XHashTable. So if your stuff is accepted, it should be two separate commits.
>

Well, I didn't see any immediate point of implementing these missing
functions until I wanted to use them with XHashTable and I did not
want to use the equality provided by these domains.

> However, if you modify the code to allow an eqFunction in a table
> construction, then I see no reason to require the Key parameter
> http://fricas.github.io/api/XHashTable
> to be of type SetCategory.

I suppose so.

> The export of
>
> table: (Key -> I) -> %
>
> could be made conditional "if Key has SetCategory then ...".
>

OK.

> In fact, it seems you would like to construct
>
> E ==> Expression Integer
> T ==> XHashTable(E, ENTRY)
>
> and then create the table with
>
> t: T := table(hash$E, eq)
>
> but which function eq (other than =$E) would make sense in that place?
>

Yes. Typically I would want to use "lexical" equality in Expression, i.e.

(x,y) +-> not (smaller?(x,y) or smaller?(y,x))

> If you use =$E and have two expressions e1 and e2 (with e1=e2) that have
> different representation, then storing some entries with keys e1 and e2
> will (most probably) give 2 table entries since hash(e1) will be
> different from hash(e2). Your specification says that for that case you
> would simply consider e1 and e2 not equal with respect to the given eq
> function. So that would also give two table entries.
>

I want to be able to provide one value of ENTRY for a given E and
another value of ENTRY for another lexically/syntactically different E
even if these two expressions are equal in the sense of =$E. It seems
to me that if I do not provide a compatible equality then this can not
be guaranteed. Am I wrong?

> This sounds like it would make more sense creating a domain
> ExpressionRep(X: ...) that exports the equality from the underlying
> representation and then define
>
> Expression(X: ...): ... == ExpressionRep(X) add
> Rep := ExpressionRep(X)
> ((x: %) = (y: %)): Boolean == (x - y) =$Rep 0
>
> Then you are able to use
>
> XHashTable(ExpressionRep(X), ENTRY)
>
> and use =$ExpressionRep(X).
>

??? I do not want to change the definition of the existing domain
Expression. Maybe you meant I could use a domain such as

LexicalExpression(R:Comparable): FunctionSpace(R) join ... ==
Expression(R) add
((x: %) = (y: %)):Boolean == not (smaller?(x,y) or smaller?(y,x))

Yes, perhaps that could be done but seemed less sensible to me.

> I'm somehow missing some concrete use case to find you patch to
> XHashTable useful.
>

E.g. Caching in the Kernel domain.

Regards,
Bill Page.

Waldek Hebisch

unread,
Mar 24, 2015, 9:35:44 AM3/24/15
to fricas...@googlegroups.com
There is a serious problem with this patch. Namely, your
'hashUpdate!' only depends on representation. This means if
we have two expressions 'e1' and 'e2' such that 'e1 = e2' (according
to equality in Expression), but 'rep(e1) ~= rep(e2)' then it
may happen that 'hash(e1) ~= hash(e2)'. However, correct
implementation of 'hash' must satisfy 'hash(e1) = hash(e2)' if
'e1 = e2'. Due to this requirement finding correct hash function
for domains with noncanonical representation is quite tricky.
Expression adds additional difficulties due to dynamic nature.

I admit that current situation where at category level we claim
there here is hash function, but a lot of domains lack implementaion
is unsatisfactry. But it is not clear what to do. We could
introduce new category, say 'Hashable' and make sure that
domains which export 'Hashable' also implement 'hash'. Or we
could weaken requirement on 'hash' and say that equality
of representations is enough to have equality of hashes.
However, weaker requirement on 'hash' would seriously limit
its usefulness, basicaly to cases when one works with
representaion. Since we want to hide representation, this
is serious limitation.

We could easily add 'hashUpdate!' to canonical domains, like Polynomial
or SAE. Also we could add 'hashUpdate!' to Set (but implementation
is tricky). But Expression and Float pose serious difficulties.
Similarely, power series and streams are problematic (they are
supposed to be lazy but hash function has to access elements).

Extra comment: current handling of kernels is unsound so why
I oppose adding code which is "no worse" than existing one?
The point is that I would like to eventiually remove
unsoundness. If we allow new unsoundness to get in then
there is almost no chance of removing unsoundness: new
one is likely to get in faster than old is removed.
So I keep old unsoundness for pragmatic reasons: replacing
it by sound code is a lot of work and takes time and
without unsound parts we would loose important functionality.

--
Waldek Hebisch
heb...@math.uni.wroc.pl

Ralf Hemmecke

unread,
Mar 24, 2015, 10:10:50 AM3/24/15
to fricas...@googlegroups.com
On 03/24/2015 02:35 PM, Waldek Hebisch wrote:
> I admit that current situation where at category level we claim
> there here is hash function, but a lot of domains lack implementaion
> is unsatisfactry. But it is not clear what to do. We could
> introduce new category, say 'Hashable' and make sure that
> domains which export 'Hashable' also implement 'hash'.

My first reaction to this is, yes. There should be Hashable. That would
be much better than putting it into SetCategory. And... it sounds like a
pretty easy modification. And it would remove some unusable
"unimplemented" functions like claiming a hashfunction for Expression
(i.e. some unsoundness). Hashable then basically says that the domain
has a canonical representation. And Hashable would have to have
equality. So there would also be no need for modifying XHashTable.

> However, weaker requirement on 'hash' would seriously limit
> its usefulness, basicaly to cases when one works with
> representaion. Since we want to hide representation, this
> is serious limitation.

That goes into the same direction that I was proposing with
ExpressionRep (which was called LexicalExpression by Bill).
LexicalExpression could export Hashable while Expression would not.
Maybe LexicalExpression could be made useful for caching... I'm not sure
and actually not really interested in this code at the moment.

> Similarely, power series and streams are problematic (they are
> supposed to be lazy but hash function has to access elements).

I don't think it makes sense to hash infinite structures.

> The point is that I would like to eventiually remove
> unsoundness. If we allow new unsoundness to get in then
> there is almost no chance of removing unsoundness ...

Good!!! I hope, some day Float no longer claims to be a Field.

Ralf

Bill Page

unread,
Mar 24, 2015, 2:07:03 PM3/24/15
to fricas-devel
Equality in Computer Algebra and Beyond
James H. Davenport, JSC Volume 34 Issue 4, October 2002

Equality is such a fundamental concept in mathematics that, in fact,
we seldom explore it in detail, and tend to regard it as trivial. When
it is shown to be non-trivial, we are often surprised. As is often the
case, the computerization of mathematical computation in computer
algebra systems on the one hand, and mathematical reasoning in theorem
provers on the other hand, forces us to explore the issue of equality
in greater detail.

http://dl.acm.org/citation.cfm?id=636429
http://www.calculemus.net/meetings/siena01/Papers/Davenport.pdf

On 24 March 2015 at 09:35, Waldek Hebisch <heb...@math.uni.wroc.pl> wrote:
> Bill Page wrote:
>>
>> The following patches provide hashUpdate! and as a result hash
>> functions to the Expression, Kernel, Float and Complex domains.
>> ...
>> wspage@opensuse:~/fricas-ker> svn diff > ../exprhash.patch
>> wspage@opensuse:~/fricas-ker> cat ../exprhash.patch
>> Index: src/algebra/expr.spad
>> ===================================================================
>> --- src/algebra/expr.spad (revision 1893)
>> +++ src/algebra/expr.spad (working copy)
>> @@ -147,7 +147,10 @@
>> x : % = y : % == (x - y) =Rep
>> numer x == numer(x)$Rep
>> denom x == denom(x)$Rep
>> -
>> + hashUpdate!(s : HashState, x : %) : HashState ==
>> + s := hashUpdate!(s, numer x)
>> + s := hashUpdate!(s, denom x)
>> +
>> EREP := Record(num : MP, den : MP)
>>
>> coerce(p : MP) : % == [p, 1]$EREP pretend %
>
> There is a serious problem with this patch. Namely, your
> 'hashUpdate!' only depends on representation. This means if
> we have two expressions 'e1' and 'e2' such that 'e1 = e2'
> (according to equality in Expression), but 'rep(e1) ~= rep(e2)'
> then it may happen that 'hash(e1) ~= hash(e2)'.

Yes, certainly I (mostly) agree with you, although my definition of
'hashUpdate!' does not depend on the representation as such since it
uses only exported operations. The representation could change
without changing this definition.

That is why I wrote:

>> Also
>> included is a patch to XHashTable to optionally pass a user specified
>> function representing equality in the key domain. This is important
>> for the use of XHashTable in domains like Expression where equality
>> is not canonical."
>> ...

> However, correct
> implementation of 'hash' must satisfy 'hash(e1) = hash(e2)' if
> 'e1 = e2'. Due to this requirement finding correct hash function
> for domains with noncanonical representation is quite tricky.
> Expression adds additional difficulties due to dynamic nature.
>

Sure, but Davenport 2002 advises that I should ask: "equal as what?"
To satisfy this that we need to be able to pass an additional
parameter to XHashTable.

> I admit that current situation where at category level we claim
> there here is hash function, but a lot of domains lack implementaion
> is unsatisfactry. But it is not clear what to do. We could
> introduce new category, say 'Hashable' and make sure that
> domains which export 'Hashable' also implement 'hash'. Or we
> could weaken requirement on 'hash' and say that equality
> of representations is enough to have equality of hashes.
> However, weaker requirement on 'hash' would seriously limit
> its usefulness, basicaly to cases when one works with
> representaion. Since we want to hide representation, this
> is serious limitation.
>

I think a better way to solve this is to realize that the
"unsoundness" is in the specification not the implementation. It seems
to me that 'hash' and 'hashUpdate!' should be exported by the category
'Comparable' without presuming any relationship with the
"mathematical" equality = that should actually be exported by
SetCategory rather than BasicType. Then we only need that

'hash(e1) = hash(e2)' if 'not (smaller?(e1,e2) or smaller?(e2,e1)'

Even better perhaps would be if Comparable also exported an operation
named 'equal' that had this relationship to 'smaller?'. The name
'equal' should remind one of EQUAL in Lisp (in particular IBM Lisp/VM,
ref. Davenport).

> We could easily add 'hashUpdate!' to canonical domains, like Polynomial
> or SAE. Also we could add 'hashUpdate!' to Set (but implementation
> is tricky). But Expression and Float pose serious difficulties.

The fact that there are "serious difficulties" should be an indication
that the specification is wrong. In fact as you know according to
Richardson's theorem it is impossible even in princple in Expression
since we have rational functions plus 'abs', and 'sin'. But the
concept of hashing is so important that we should not make it to
extraordinary things in order to use it in domains like Expression.

> Similarly, power series and streams are problematic (they are
> supposed to be lazy but hash function has to access elements).
>

For lazy infinite objects it is important to compare the
algorithms/functions that generate them rather than values. Since two
or more algorithms may generate the same sequence we have the same
situation as in Expression but that should not mean we cannot hash
them.

> Extra comment: current handling of kernels is unsound so why
> I oppose adding code which is "no worse" than existing one?
> The point is that I would like to eventually remove
> unsoundness. If we allow new unsoundness to get in then
> there is almost no chance of removing unsoundness: new
> one is likely to get in faster than old is removed.
> So I keep old unsoundness for pragmatic reasons: replacing
> it by sound code is a lot of work and takes time and
> without unsound parts we would lose important functionality.
>

In principle I agree and as you know I am also concerned about the
current handling of kernels. But I think the problem here not
unsoundness but rather the assumption that hash should necessarily be
related to *mathematical* equality.

Bill.

Bill Page

unread,
Mar 24, 2015, 2:23:40 PM3/24/15
to fricas-devel
On 24 March 2015 at 10:10, Ralf Hemmecke <ra...@hemmecke.org> wrote:
> On 03/24/2015 02:35 PM, Waldek Hebisch wrote:
>> I admit that current situation where at category level we claim
>> there here is hash function, but a lot of domains lack implementaion
>> is unsatisfactry. But it is not clear what to do. We could
>> introduce new category, say 'Hashable' and make sure that
>> domains which export 'Hashable' also implement 'hash'.
>
> My first reaction to this is, yes. There should be Hashable. That would
> be much better than putting it into SetCategory. And... it sounds like a
> pretty easy modification. And it would remove some unusable
> "unimplemented" functions like claiming a hashfunction for Expression
> (i.e. some unsoundness).

I do not think that it is unsound to claim a hashfunction for
Expression. What is unsound is to claim that a hashfunction for
Expression must be related to the definition of = in Expression. Or
rather, as I said in my reply to Waldek: For hashing we need to know
when two things are equal "as what" i.e. in what sense. Two expression
can be equal in a mathematical sense or only in a syntactic sense. I
proposed that hashing should therefore be connected with whether or
not a domain is 'Comparable'.

> Hashable then basically says that the domain
> has a canonical representation. And Hashable would have to have
> equality. So there would also be no need for modifying XHashTable.
>

As far as I can see this has nothing to do with representation. I
agree that it makes sense to specify hashing only in the context of an
appropriate equality. Maybe it is true that this is best handled by
modifying the category structure rather than passing an additional
parameter to XHashTable, but the way I saw it was that adding the
additional parameter was not much worse than being able to pass a user
specified hash function and stating in the documentation what
conditions is should satisfy.

>> However, weaker requirement on 'hash' would seriously limit
>> its usefulness, basicaly to cases when one works with
>> representaion. Since we want to hide representation, this
>> is serious limitation.
>
> That goes into the same direction that I was proposing with
> ExpressionRep (which was called LexicalExpression by Bill).
> LexicalExpression could export Hashable while Expression would not.

Expression already exports Comparable, although not OrderedSet.

> Maybe LexicalExpression could be made useful for caching... I'm not sure
> and actually not really interested in this code at the moment.
>

It sounds oddly inappropriate that you comment that you are "not
really interested" :(

>> Similarely, power series and streams are problematic (they are
>> supposed to be lazy but hash function has to access elements).
>
> I don't think it makes sense to hash infinite structures.
>
>> The point is that I would like to eventiually remove
>> unsoundness. If we allow new unsoundness to get in then
>> there is almost no chance of removing unsoundness ...
>
> Good!!! I hope, some day Float no longer claims to be a Field.
>

Yes, I think this is a closely related problem involving different
possible definitions of equality.

Bill.

Waldek Hebisch

unread,
Mar 26, 2015, 5:59:22 PM3/26/15
to fricas...@googlegroups.com
Bill Page wrote:
>
> Equality in Computer Algebra and Beyond
> James H. Davenport, JSC Volume 34 Issue 4, October 2002
>
Well, Axiom philosophy (which probably is quite close to Davenport
philosophy) is to say "equal in given domain". That is, if you
want different equality you are supposed to create a different
domain.

Generaly, hash function and equalty should be compatible.
If you want non-default equality then IMO the only reasonable
way it to pass _both_ equality and hash function as parameter
to XHashTable. In such case it is user responsibility to
ensure that they are compatible. But default equality
and hash function if present should be compatible.

> I think a better way to solve this is to realize that the
> "unsoundness" is in the specification not the implementation. It seems
> to me that 'hash' and 'hashUpdate!' should be exported by the category
> 'Comparable' without presuming any relationship with the
> "mathematical" equality = that should actually be exported by
> SetCategory rather than BasicType. Then we only need that
>
> 'hash(e1) = hash(e2)' if 'not (smaller?(e1,e2) or smaller?(e2,e1)'

I do not understand why do you want to add Comparable to the mix?
If correctly implemented your test with 'smaller?' should give
the same result as '='. In the paper you mentioned Davenport
writes about system beeing congruential (or not). Clearly
congruential systems are nicer. So equality in a domain may
be a bit strange but still should obey usual axioms. And
my main objection is that your patch would make Expression
with operations restricted to '=' and 'hash' non-congruential.

In case of Expression AFAICS we always have

not(smaller?(x, y) or smaller?(y, x)) implies x = y

so any lack of equality means that Expression with '=' and 'smaller?'
is not congruential. As I wrote I think that this can happen but
consider such thing as a bug. ATM I do not have a testcase,
if I had I would consider this as high-priority bug.

Extra remark: my inpression after reading Davenport paper it
that he did not want to praise Axiom too much, but he thinks
that system should be organized into domains and _operation
from the domain should be compatible (congruential)_. Real
word can force us to use inferior equality (like MoteCarlo
tests), but we should introduce no gratitious incompatibilities.

> > We could easily add 'hashUpdate!' to canonical domains, like Polynomial
> > or SAE. Also we could add 'hashUpdate!' to Set (but implementation
> > is tricky). But Expression and Float pose serious difficulties.
>
> The fact that there are "serious difficulties" should be an indication
> that the specification is wrong. In fact as you know according to
> Richardson's theorem it is impossible even in princple in Expression
> since we have rational functions plus 'abs', and 'sin'. But the
> concept of hashing is so important that we should not make it to
> extraordinary things in order to use it in domains like Expression.

You are mixing two things: Richardson's theorem says that equality
of functions represented by expressions is not computable.
Practically this means that equality in Expression should be
conservative approximation to equality of functions. This
says nothing about possibility of building hash function
compatible with equality in Expression. Serious difficulty
is because we expect hash functions to be efficient.
Slow hash function can be implemented using association
list of arguments and values. We compute hash function
first searching association list via linear search. If
argument is there we return associated values. If not
we generate new values and append new pair to the
association list. Now, main use of hash functions is
to avoid linear search. Of course the method above
is useless for such purpose because we use linear
search to comput hash.

You write about changing specification. However, hash function
which do not agree with equality is useless for speading up
existing searches. So you need to change specification
of searches and the question is if you gain anything using
such modified searches. Even if there is gain required
changes are likely to be nontrivial and using changes search
without other changes is likely to lead to bugs. In
other words we have problem that types system is suppesed
to solve: current set of operations should go together and
your redefined operations should go togeter and do not
mix with original ones (except when data is explicitely
converted). I hope that you see now why using current
signatures has problems. Now domain or new name for operation
would resolve this. Of course, there is still question
if there are interesting use cases for redefined operations...

--
Waldek Hebisch
heb...@math.uni.wroc.pl

Waldek Hebisch

unread,
Mar 26, 2015, 7:00:10 PM3/26/15
to fricas...@googlegroups.com
Actually, Float is messy due to variable precision. At given
precison floats are supposed to have specified number of
significant bits (by default 68). However, when precision
is increased then we produce floats which have more (say 200)
bits. When precision is decreased back to 68 bits the 200 bits values
are considered valid floats. Somewhat strange conseqence of
this is that if 200 bits floats is exactly representable
as 68 bit float, then exponent fields is larger than in typical
68 float while mantissa have a lot of trailing 0 bits. So
correct hash function for floats could be obtained by dropping
trailing 0 bits from representation and adjusting exponent.
This would effectively give canonical representation.
Another (equivalent) way it to convert to Fraction(Integer).

So correct equality (and hash) function on Float is not that
hard as my previous comment would suggest. But correct
_use_ of hash function on floats is different matter: due
to rounding results which would be equal in infinitely
precise arithmetic are likely to differ. So normal case
is that if we have copy of given number than it is equal
to original. But if we have numbers which came out from
different computations, then there is significant chance
that they will differ. So there is question how useful
hashing of floats. Note that practical search is likely
to use fuzzy comparizons which requires different implementation
than exact search.

--
Waldek Hebisch
heb...@math.uni.wroc.pl

Waldek Hebisch

unread,
Mar 26, 2015, 7:15:36 PM3/26/15
to fricas...@googlegroups.com
Bill Page wrote:
> Also
> included is a patch to XHashTable to optionally pass a user specified
> function representing equality in the key domain. This is important
> for the use of XHashTable in domains like Expression where equality is
> not canonical.
>
I only now looked more carefully at this part. Since we already
have hash function as argument passing also equality is IMO
quite reasonable. As was mentioned, if we get hash and equality
as argument we do not need SetCategory. So I think that
Key should be just Type and single argument 'table' should
be conditional on SetCategory. BTW: looking at the diff
I was unable to find implementation of one argument 'table'
(old one was changed to two arguments)...

--
Waldek Hebisch
heb...@math.uni.wroc.pl

Bill Page

unread,
Mar 26, 2015, 10:17:34 PM3/26/15
to fricas-devel
On 26 March 2015 at 19:15, Waldek Hebisch <heb...@math.uni.wroc.pl> wrote:
> Bill Page wrote:
>> ...
>> XHashTable(Key: SetCategory, Entry: Type):
>> Join(TableAggregate(Key, Entry), finiteAggregate, shallowlyMutable) with
>> table: (Key -> SingleInteger) -> %
>> @@ -111,6 +112,10 @@
>> ++ the sense that from k1=k2 follows h(k1)=h(k2). If that is not
>> ++ the case, k1 and k2 will internally be considered as being
>> ++ different keys.
>> + table: (Key -> SingleInteger,(Key,Key) -> Boolean) -> %
>> + ++ table(h,eq) creates an empty hash table that uses eq instead
>> + ++ of the "=" from the Key domain. Note that h and eq must be
>> + ++ compatible in the sense that h(k1)~=h(k2) -> not eq(k1,k2)
>> == add
>> KE ==> Record(key: Key, entry: Entry)
>> UE ==> Union(Entry, "failed")
>
> I only now looked more carefully at this part. Since we already
> have hash function as argument passing also equality is IMO
> quite reasonable.

OK.

> As was mentioned, if we get hash and equality
> as argument we do not need SetCategory. So I think that
> Key should be just Type and single argument 'table' should
> be conditional on SetCategory.

OK.

> BTW: looking at the diff
> I was unable to find implementation of one argument 'table'
> (old one was changed to two arguments)...
>

Please refer to:

On 23 March 2015 at 10:41, Bill Page <bill...@newsynthesis.org> wrote:

> @@ -433,11 +440,15 @@
> function into the representation.
> )endif
>
> - table(hashfunction: Key -> SingleInteger): % ==
> + table(hashfunction: Key -> SingleInteger,eqfunction: (Key,Key) -> Boolean): % ==
> n: N := arrayLengths 0
> maxEntries: Z := maxLoad n
> maxVirtualEntries: Z := maxVirtualLoad n
> - per [0, maxEntries, 0, maxVirtualEntries, 0, newArr n, hashfunction]
> + hashfunction := forceLazySlot((hash$Key)@(Key -> I))$Lisp pretend (Key->I)
> + per [0, maxEntries, 0, maxVirtualEntries, 0, newArr n, hashfunction, eqfunction]
> + table(hashfunction: Key -> SingleInteger): % ==
> + table(hashfunction,forceLazySlot((_=$Key)@((Key,Key) -> B))$Lisp pretend ((Key,Key)->B))
> +
> empty(): % ==
> table(forceLazySlot((hash$Key)@(Key -> I))$Lisp pretend (Key->I))
>

Thanks.

Ralf Hemmecke

unread,
Mar 27, 2015, 3:30:53 AM3/27/15
to fricas...@googlegroups.com
> I only now looked more carefully at this part. Since we already
> have hash function as argument passing also equality is IMO
> quite reasonable. As was mentioned, if we get hash and equality
> as argument we do not need SetCategory. So I think that
> Key should be just Type and single argument 'table' should
> be conditional on SetCategory.

Well.. If we drop SetCategory from Key, then the documentation must be
rewritten in some places. I strongly oppose if just the new function is
committed without changing the appropriate places in the whole
documentation, i.e. the part inside )if LiterateDoc ... )endif.

I have created XHashTable simply because the original Table was not
taking care of =$Key and resorted to AssociationList. I haven't looked
now, but the documentation certainly has some places that refer to
equality from Key.

Furthermore, deviating from

if eq(a, b) then hash(a) = hash(b)

might yield unpredictable results. Maybe not for insertion and
searching, but it should be checked carefully what happens when

not eq(a,b) and hash(a) = hash(b)

when a or b is deleted from the table, because deleting a might result
in actually removing b from the table.

OK, if Bill requires

if not eq(a,b) then hash(a) ~= hash(b)

that's not an issue, but still. There should be a thourough revision.

If Bill then uses as eq for Expression that builds on smaller?, i.e.
Comparable, I also oppose. Comparable is there for some technical
reason. It's an arbitrary total order on a (maybe non-totally ordered)
domain. Technically, Bill's idea is probably OK, but here again it comes
to my mind to build Expression(X) as a wrapper domain of underlying
ExpressionRep domains that have equality according to their
representation, i.e. only equal (via =) if rep is equal. Doing this
would also avoid having to change XHashTable, btw. And I guess using =
from ExpressionRep would be faster than checking

not smaller?(x, y) and not smaller?(y, x)


> BTW: looking at the diff
> I was unable to find implementation of one argument 'table'
> (old one was changed to two arguments)...

The following links should answer your concern about implementation of
table().

https://github.com/hemmecke/fricas/blob/master-hemmecke/src/algebra//xhash.spad#L107
https://github.com/hemmecke/fricas/blob/master-hemmecke/src/algebra//aggcat.spad#L967

Ralf

Bill Page

unread,
Mar 31, 2015, 4:46:35 PM3/31/15
to fricas-devel
On 27 March 2015 at 03:30, Ralf Hemmecke <ra...@hemmecke.org> wrote:
>
> Well.. If we drop SetCategory from Key, then the documentation must be
> rewritten in some places. I strongly oppose if just the new function is
> committed without changing the appropriate places in the whole
> documentation, i.e. the part inside )if LiterateDoc ... )endif.

Of course I will update the documentation but not without first
complaining that the current situation with

)if LiterateDoc ... )endif

and elsewhere

)if LiterateDoc ... )endif

in about 10 other SPAD files seems like a mess to me. Code embedded
in a lot of text, i.e. noweb, was bad enough but now we have a lot of
text embedded in code and that seems even worse. Although I do
appreciate having the documentation available, but my first reaction
was to start by deleting everything between )if LiterateDoc ...
)endif so I could actually read the code in a reasonably concise form.

Do we still have a way to produce pdf or dvi files from these
"literate" source documents?

Now that we have removed the empty noweb boilerplate from most of the
SPAD files wouldn't it make sense to introduce a new directory, say
src/doc/algebra, and put the available documentation source files
there in an easily rendered LaTeX format? In other words we would have
both 'src/algebra/xhash.spad' and 'src/doc/algebra/xhash.tex'. Yes I
know that in order to keep exactly the same documentation content we
had before with noweb we would have to copy some or all of the code
into verbatim or listing sections in the LaTeX file, but to me that
would be greatly preferable to the way it is now.

>
> I have created XHashTable simply because the original Table was not
> taking care of =$Key and resorted to AssociationList. I haven't looked
> now, but the documentation certainly has some places that refer to
> equality from Key.
>

OK.

> Furthermore, deviating from
>
> if eq(a, b) then hash(a) = hash(b)
>
> might yield unpredictable results. Maybe not for insertion and
> searching, but it should be checked carefully what happens when
>
> not eq(a,b) and hash(a) = hash(b)
>
> when a or b is deleted from the table, because deleting a might result
> in actually removing b from the table.
>

I do not intend to deviate from

if eq(a, b) then hash(a) = hash(b)

I agree that that is an essential part of the definition of hash
function. In the existing XHashTable code if an incompatible hash
function is passed the almost immediate result is an array bounds
failure in the 'rehashAux!' routine due to an unexpected duplicate
key. A simple change in 'localSearch' to check both hash and eq, for
example:

hk := h k
h1: Z := hk::Z
...
hk = h toKey mk and eq(k,toKey mk) => return p -- key found
...
not deleted? mk and hk = h toKey mk and eq(k,toKey mk) =>

is sufficient to prevent this and maybe preferable since usually
calling hash is much cheaper than computing eq even when eq defaults
to =$Key. Since evaluation of 'and' is lazy I guess the mileage will
depend on the statistics of hits and misses.

> OK, if Bill requires
>
> if not eq(a,b) then hash(a) ~= hash(b)
>
> that's not an issue, but still. There should be a thorough revision.

I suppose that you meant to write: if Bill requires:

if hash(a) ~= hash(b) then not eq(a,b)

This is logically equivalent to what you already wrote in the
documentation but I wrote it this way to emphasize this requirement on
the new eq function argument.

>
> If Bill then uses as eq for Expression that builds on smaller?, i.e.
> Comparable, I also oppose. Comparable is there for some technical
> reason. It's an arbitrary total order on a (maybe non-totally ordered)
> domain.

Ralf, I think you really need to check the details here. At this level
all reasons are "technical".

> Technically, Bill's idea is probably OK, but here again it comes
> to my mind to build Expression(X) as a wrapper domain of underlying
> ExpressionRep domains that have equality according to their
> representation, i.e. only equal (via =) if rep is equal. Doing this
> would also avoid having to change XHashTable, btw. And I guess
> using = from ExpressionRep would be faster than checking
>
> not smaller?(x, y) and not smaller?(y, x)
>

Why do you expect = in Rep to be equivalent to

not smaller?(x, y) and not smaller?(y, x)

But I do agree that there are more efficient ways to compute this sort
of equality that calling 'smaller?' twice (in the worst case). I'll
say more about why I think 'Comparable' is important when I get a
chance to reply to Waldek but for now let's just agree that passing an
additional parameter to 'table()$XHashTable' is not so bad.

Bill.

Bill Page

unread,
Mar 31, 2015, 4:48:40 PM3/31/15
to fricas-devel
Oops, I meant to write:

On 31 March 2015 at 16:46, Bill Page <bill...@newsynthesis.org> wrote:
> ...
> Of course I will update the documentation but not without first
> complaining that the current situation with
>
> )if LiterateDoc ... )endif
>

and elsewhere as

)if false ... )endif

> in about 10 other SPAD files seems like a mess to me.
> ...

Waldek Hebisch

unread,
Mar 31, 2015, 6:21:23 PM3/31/15
to fricas...@googlegroups.com
1) Why do you expect 'hash' to be cheaper than 'eq'? In normal
case both 'hash' and 'eq' have to look at all components
of a value. For builtion types 'hash' have to perform some
computations which look more complicated than equality.
In important special cases equality is quite fast: if values
are different, then frequently we will notice difference
after looking only at few components. If values occupy the
same place in memory. then Lisp 'EQ' will very quickly tell
us that they are equal. For expression equality requires
computing difference, while simple hash would remove irrationalities
from denominators, which is much more involved.
2) What you write above looks like attempt at error hiding.
If essential precondition is violated, then more approprate
reaction is to signal error. In this case I would say that
doing extra computation to detect errors is probably not
worth the effort. Rather, detectiong out of bound array
index is enough. We may add some test to give better
error message, but it should be cheap (like comparing index
with bound).

--
Waldek Hebisch
heb...@math.uni.wroc.pl

Bill Page

unread,
Apr 1, 2015, 12:38:22 AM4/1/15
to fricas-devel
On 31 March 2015 at 18:21, Waldek Hebisch <heb...@math.uni.wroc.pl> wrote:
>
> 1) Why do you expect 'hash' to be cheaper than 'eq'? In normal
> case both 'hash' and 'eq' have to look at all components
> of a value.

If we look for A among many B_i we must look at all the components of
A and all the components of B_i for each B_i. If I compute the hash of
A by looking at all the components of A, I only need to do that once.

> For builtin types 'hash' have to perform some
> computations which look more complicated than equality.

Domains like SingleInteger and Symbol export hash (SXHASH) directly
from Lisp. But do any of the other built-in types export 'hash'? I
especially miss 'hash' in the case of Record.

> In important special cases equality is quite fast: if values
> are different, then frequently we will notice difference
> after looking only at few components.

Yes. As I said: YMMV. It is not unusual that the cost of computing ~=
is different than the cost of computing =.

> If values occupy the
> same place in memory. then Lisp 'EQ' will very quickly tell
> us that they are equal.

At least as long as we are not using a "quantum computer", if we
already know that two things are identical then of course it is fast
to check that they are equal.

> For expression equality requires
> computing difference, while simple hash would remove irrationalities
> from denominators, which is much more involved.

Expressing the difference of two rational functions as a rational
function involves at least the product of two polynomials but I admit
that I am not certain that it is even possible to define a compatible
hash function for expression equality defined in this way. Could you
explain what you mean by removing irrationalities from the
denominators?

Perhaps relevant to this discussion, I liked the following article:

Equality and hashing for (almost) free: Generating implementations
from abstraction functions
Derek Rayside, et al.
http://dspace.mit.edu/handle/1721.1/51695

Abstract:
In an object-oriented language such as Java, every class requires
implementations of two special methods, one for determining equality
and one for computing hash codes. Although the specification of these
methods is usually straightforward, they can be hard to code (due to
subclassing, delegation, cyclic references, and other factors) and
often harbor subtle faults. A technique is presented that simplifies
this task. Instead of writing code for the methods, the programmer
gives, as a brief annotation, an abstraction function that defines an
abstract view of an object's representation, and sometimes an
additional observer in the form of an iterator method. Equality and
hash codes are then computed in library code that uses reflection to
read the annotations. Experiments on a variety of programs suggest
that, in comparison to writing the methods by hand, our technique
requires less text from the programmer and results in methods that are
more often correct.

--

I am not sure how easily their proposed method could be applied to
FriCAS, but I think their definition of the problem is important.

> 2) What you write above looks like attempt at error hiding.
> If essential precondition is violated, then more approprate
> reaction is to signal error.

Yes perhaps but in 'xhash.spad' Ralf wrote:

table: (Key -> SingleInteger) -> %
++ table(h) creates an empty hash table that uses h instead of
++ hash$Key. Note that h should be a mathematical function in
++ the sense that from k1=k2 follows h(k1)=h(k2). If that is not
++ the case, k1 and k2 will internally be considered as being
++ different keys.

My suggested change would only do what he claims.

> In this case I would say that
> doing extra computation to detect errors is probably not
> worth the effort. Rather, detecting out of bound array
> index is enough.

It was very confusing to me until I realized that the documentation
was not quite accurate.

> We may add some test to give better
> error message, but it should be cheap (like comparing index
> with bound).

Yes, that is quite easy:

Bill Page

unread,
Apr 1, 2015, 12:43:42 AM4/1/15
to fricas-devel
Oops, I meant to add this code:

On 1 April 2015 at 00:38, Bill Page <bill...@newsynthesis.org> wrote:
> ...
>> We may add some test to give better
>> error message, but it should be cheap (like comparing index
>> with bound).
>
> Yes, that is quite easy:

rehashAux!(x: %, ix: Z): % ==
...
k: Key := toKey mk
-- Note that k is not in c, and there are no DELETED positions.
-- Thus, -m<=p<0.
- p := m + localSearch(c, k, h)
- setKeyEntry!(c, m, p, k, getEntry(a, n, i))
+ q := localSearch(c, k, h, eq)
+ if q<0 then
+ p := m + q
+ setKeyEntry!(c, m, p, k, getEntry(a, n, i))
+ else -- should not happen!
+ error "XHashTable rehashAux!: duplicate key"
r.arr := c --destructively set new array
r.idx := ix
r.maxNumOfEntries := maxLoad m

--

Waldek Hebisch

unread,
Apr 1, 2015, 2:26:36 AM4/1/15
to fricas...@googlegroups.com
>
> On 31 March 2015 at 18:21, Waldek Hebisch <heb...@math.uni.wroc.pl> wrote:
> >
> > 1) Why do you expect 'hash' to be cheaper than 'eq'? In normal
> > case both 'hash' and 'eq' have to look at all components
> > of a value.
>
> If we look for A among many B_i we must look at all the components of
> A and all the components of B_i for each B_i. If I compute the hash of
> A by looking at all the components of A, I only need to do that once.

Normal case of equality a = b is:

for all components c
if c(a) ~= c(b) then return false
true

Normal case for hash(a) is:

for all components c
s := hashUpdate(s, c(a))

In both cases you need to go trough components of a and b, but
in case of equality there is possibility of early exit. In
case of sets we do all-by all comparison of elements, but
that is quite unusual.
For example transforming:

1/sqrt(x^2 + 1) into sqrt(x^2 + 1)/(x^2 + 1)

This is trivial case, but in general all algebraics (in particular roots)
can be removed from denominator to numerator, so that denominator
is purely transcendental. If you also normalize numerator in some
way (for example by forcing leading coefficient to 1) then
for given set of independent kernels you get canononical form.

>
> Perhaps relevant to this discussion, I liked the following article:
>
> Equality and hashing for (almost) free: Generating implementations
> from abstraction functions
> Derek Rayside, et al.
> http://dspace.mit.edu/handle/1721.1/51695
>
> Abstract:
> In an object-oriented language such as Java, every class requires
> implementations of two special methods, one for determining equality
> and one for computing hash codes.

> I am not sure how easily their proposed method could be applied to
> FriCAS, but I think their definition of the problem is important.

I will look at it, but usualy in such setting authors mean data
structure equality, which is not entirely trivial but has no
problems which appear in math.

>
> > 2) What you write above looks like attempt at error hiding.
> > If essential precondition is violated, then more approprate
> > reaction is to signal error.
>
> Yes perhaps but in 'xhash.spad' Ralf wrote:
>
> table: (Key -> SingleInteger) -> %
> ++ table(h) creates an empty hash table that uses h instead of
> ++ hash$Key. Note that h should be a mathematical function in
> ++ the sense that from k1=k2 follows h(k1)=h(k2). If that is not
> ++ the case, k1 and k2 will internally be considered as being
> ++ different keys.
>
> My suggested change would only do what he claims.

Ralf should say if he cares about this specification. AFAICS
currently even if we ignore problem with rehash this does
not work. Namely consider case when eq(k1, k2) holds but h(k1)~= h(k2).
Still, k1 and k2 may map to the same slot (because we reduce hashes
to smaller range, so after reduction values may match). Without
extra test we will store one of k-s and ignore the second.

To fix this basically in any place you use 'eq' we would have use
your variant. We could say that k1 and k2 _may_ be treated
as different and modify rehash to ignore (drop) duplicates.
But then you may inseret the same key case several times:

you insert k1
you insert k2 bacause it maps to different slot
rehash: k1 and k2 map to the same slot, drop one say k1
reshash: k1 and k2 map to different slots
you insert k1
rehash: k1 and k2 map to the same slot, drop one say k2
reshash: k1 and k2 map to different slots
you insert k2
...

That looks crazy for me, so while it is easy to specify
and probably to implement I do not think we want this.

If treating k1 and k1 is really desirable, then custom
eq, for examples

eq(x, y) ==
x = y and hash(x) = hash(y)

looks like reliable way. This has advantage that normal
case is not burdened with extra computations -- we do
extra work only when needed.

--
Waldek Hebisch
heb...@math.uni.wroc.pl

Ralf Hemmecke

unread,
Apr 1, 2015, 11:08:05 AM4/1/15
to fricas...@googlegroups.com
On 03/31/2015 10:46 PM, Bill Page wrote:
> Of course I will update the documentation but not without first
> complaining that the current situation with
>
> )if LiterateDoc ... )endif

[snip]

> in about 10 other SPAD files seems like a mess to me.

Yes, it is. Waldek translated some noweb stuff into )if false ...
)endif. And I thought for my stuff, I rather like a more telling markup.
It's not perfect, but it's sufficient for my literate documentation and
doesn't conflict with Waldek's wish to have .spad instead of
.spad.pamphlet files.

There are also two files in my master-hemmecke branch:

https://github.com/hemmecke/fricas/blob/master-hemmecke/src/doc/literatedoc.sty

https://github.com/hemmecke/fricas/blob/master-hemmecke/src/doc/literatedoc.awk

That gives sufficiently readable output for code that follows the )if
LiterateDoc style.

> Code embedded in a lot of text, i.e. noweb, was bad enough but now we
> have a lot of text embedded in code and that seems even worse.

Sorry, but I still think that writing literate programs is way better
than just writing code and pointing people to documents that are perhaps
inaccessible to the reader. How would you have understood how XHashTable
works if I hadn't included all the background information and design
decisions? And yes, separating code from documentation makes it easier
to have them diverge.

> Although I do appreciate having the documentation available, but my
> first reaction was to start by deleting everything between )if
> LiterateDoc ... )endif so I could actually read the code in a
> reasonably concise form.

That can be done by either an appropriate editor (would be nice if
someone convinces emacs to fold such parts) or a simple one-line-sed-script.

> Do we still have a way to produce pdf or dvi files from these
> "literate" source documents?

See above.

Ralf

Ralf Hemmecke

unread,
Apr 1, 2015, 5:28:44 PM4/1/15
to fricas...@googlegroups.com
On 04/01/2015 12:21 AM, Waldek Hebisch wrote:
> Bill Page wrote:
>> I agree that that is an essential part of the definition of hash
>> function. In the existing XHashTable code if an incompatible hash
>> function is passed the almost immediate result is an array bounds
>> failure in the 'rehashAux!' routine due to an unexpected duplicate
>> key. A simple change in 'localSearch' to check both hash and eq, for
>> example:
>>
>> hk := h k
>> h1: Z := hk::Z
>> ...
>> hk = h toKey mk and eq(k,toKey mk) => return p -- key found
>> ...
>> not deleted? mk and hk = h toKey mk and eq(k,toKey mk) =>
>>
>> is sufficient to prevent this and maybe preferable since usually
>> calling hash is much cheaper than computing eq even when eq defaults
>> to =$Key. Since evaluation of 'and' is lazy I guess the mileage will
>> depend on the statistics of hits and misses.

> 2) What you write above looks like attempt at error hiding.
> If essential precondition is violated, then more approprate
> reaction is to signal error. In this case I would say that
> doing extra computation to detect errors is probably not
> worth the effort. Rather, detectiong out of bound array
> index is enough. We may add some test to give better
> error message, but it should be cheap (like comparing index
> with bound).

Well, in fact, Bill found a (small) bug.

table: (Key -> SingleInteger) -> %
++ table(h) creates an empty hash table that uses h instead of
++ hash$Key. Note that h should be a mathematical function in
++ the sense that from k1=k2 follows h(k1)=h(k2). If that is not
++ the case, k1 and k2 will internally be considered as being
++ different keys.

The last sentence is nothing for the specification. So I require

if k1=k2 then h(k1) = h(k2).

And since that is precondition in the specification, anything can happen
if a user of that function does not meet that condition. In other words,
I don't see need to add extra checking code.

I'm slowly checking whether I can convince myself to allow eq instead of
=$Key. Might take a few more days.

Ralf

Bill Page

unread,
Apr 2, 2015, 1:37:16 PM4/2/15
to fricas-devel
On 1 April 2015 at 17:28, Ralf Hemmecke <ra...@hemmecke.org> wrote:
> ..
> table: (Key -> SingleInteger) -> %
> ++ table(h) creates an empty hash table that uses h instead of
> ++ hash$Key. Note that h should be a mathematical function in
> ++ the sense that from k1=k2 follows h(k1)=h(k2). If that is not
> ++ the case, k1 and k2 will internally be considered as being
> ++ different keys.
>
> The last sentence is nothing for the specification. So I require
>
> if k1=k2 then h(k1) = h(k2).
>
> And since that is precondition in the specification, anything can happen
> if a user of that function does not meet that condition. In other words,
> I don't see need to add extra checking code.
>

So in your view the last sentence should be deleted (since it is not
true) and the possibility of an obscure error message sometime during
the use of XHashTable is sufficient in case the hash function passed
by the user does not meet the precondition. Right?

Bill.

Bill Page

unread,
Apr 2, 2015, 2:09:21 PM4/2/15
to fricas-devel
On 1 April 2015 at 11:08, Ralf Hemmecke <ra...@hemmecke.org> wrote:
> On 03/31/2015 10:46 PM, Bill Page wrote:
>> Code embedded in a lot of text, i.e. noweb, was bad enough but now we
>> have a lot of text embedded in code and that seems even worse.
>
> Sorry, but I still think that writing literate programs is way better
> than just writing code and pointing people to documents that are perhaps
> inaccessible to the reader.

Why would these documents be inaccessible to the reader? If one is
already editing a file named

src/algebra/xhash.spad

it is not hard to check for the existence of an associated file

src/doc/algebra/xhash.tex

(and or after installation, the same file in pdf or dvi format in an
appropriate place).

Even better: Why not provide a link to such documentation (if it
exists) while browsing in Hyperdoc?

> How would you have understood how XHashTable
> works if I hadn't included all the background information and design
> decisions?

As I said, I do appreciate reading the background information and
design decisions provided in your documentation, I just don't
appreciate having it mixed up with the code itself.

> And yes, separating code from documentation makes it easier
> to have them diverge.
>

Sure. But as I see it the main consequence of the fact that the
original Axiom open source project was fixated on Lisp and infatuated
with literate programming is that basically after 15 years the
developer documentation at the SPAD library level for Axiom or its
forked projects remains woefully inadequate. Only a small fraction of
the original code or even newly written code benefitted from a focus
on this style of documentation.

>> Although I do appreciate having the documentation available, but my
>> first reaction was to start by deleting everything between )if
>> LiterateDoc ... )endif so I could actually read the code in a
>> reasonably concise form.
>
> That can be done by either an appropriate editor (would be nice if
> someone convinces emacs to fold such parts) or a simple one-line-sed-script.
>

Of course stripping the documentation in this case was easy (but might
not be so easy where )if false was present in the source code for
other reasons). Concerning editors, I suppose that is part of my
point: There are no good/standard tools for editing such mixed up
source code and documentation. And more to the point, even if there
were such tools it is doubtful that everyone who wants to contribute
to FriCAS would agree to use them.

>> Do we still have a way to produce pdf or dvi files from these
>> "literate" source documents?
>
> See above.
>

Thanks. I think it would be nice if this was done during 'make' (or at
least as an option of make) and the resulting pdf or dvi files were
linked in Hyperdoc.

Bill.

Bill Page

unread,
Apr 2, 2015, 4:27:39 PM4/2/15
to fricas-devel
On 1 April 2015 at 02:26, Waldek Hebisch <heb...@math.uni.wroc.pl> wrote:
>>
>> On 31 March 2015 at 18:21, Waldek Hebisch <heb...@math.uni.wroc.pl> wrote:
>> ...
>> > For expression equality requires
>> > computing difference, while simple hash would remove irrationalities
>> > from denominators, which is much more involved.
>>
>> Expressing the difference of two rational functions as a rational
>> function involves at least the product of two polynomials but I admit
>> that I am not certain that it is even possible to define a compatible
>> hash function for expression equality defined in this way. Could you
>> explain what you mean by removing irrationalities from the
>> denominators?
>
> For example transforming:
>
> 1/sqrt(x^2 + 1) into sqrt(x^2 + 1)/(x^2 + 1)
>
> This is trivial case, but in general all algebraics (in particular roots)
> can be removed from denominator to numerator, so that denominator
> is purely transcendental. If you also normalize numerator in some
> way (for example by forcing leading coefficient to 1) then
> for given set of independent kernels you get canononical form.
>

I think that such transformations do not respect all possible
definitions of equality, but your claim then is that it is possible to
get a canonical form for all expressions in the Expression domain
relative to whatever is meant by equality ( =$Expression ) in this
domain? Noting for example that

1/sqrt(x^2 + 1) - sqrt(x^2 + 1)/(x^2 + 1)

does in fact already evaluate to 0.

> ...
>>
>> > 2) What you write above looks like attempt at error hiding.
>> > If essential precondition is violated, then more approprate
>> > reaction is to signal error.
>>
>> Yes perhaps but in 'xhash.spad' Ralf wrote:
>>
>> table: (Key -> SingleInteger) -> %
>> ++ table(h) creates an empty hash table that uses h instead of
>> ++ hash$Key. Note that h should be a mathematical function in
>> ++ the sense that from k1=k2 follows h(k1)=h(k2). If that is not
>> ++ the case, k1 and k2 will internally be considered as being
>> ++ different keys.
>>
>> My suggested change would only do what he claims.
>
> Ralf should say if he cares about this specification.

Apparently his answer is that he does not care. :(

> AFAICS
> currently even if we ignore problem with rehash this does
> not work. Namely consider case when eq(k1, k2) holds but h(k1)~= h(k2).

That is exactly the case I was considering.

> Still, k1 and k2 may map to the same slot (because we reduce hashes
> to smaller range, so after reduction values may match). Without
> extra test we will store one of k-s and ignore the second.
>

Yes, but not ignore. It is important to remember what happens to the
entry part of the (key,entry) pair in this case.

> To fix this basically in any place you use 'eq' we would have use
> your variant.

Yes. There are only two places in the code where eq is used. I
mentioned both places in my email.

> We could say that k1 and k2 _may_ be treated
> as different and modify rehash to ignore (drop) duplicates.

No. In that case the table behaves unpredictably after rehash because
corresponding entries are dropped.

> But then you may inseret the same key case several times:
>
> you insert k1
> you insert k2 because it maps to different slot
> rehash: k1 and k2 map to the same slot, drop one say k1
> reshash: k1 and k2 map to different slots
> you insert k1
> rehash: k1 and k2 map to the same slot, drop one say k2
> reshash: k1 and k2 map to different slots
> you insert k2
> ...
>
> That looks crazy for me, so while it is easy to specify
> and probably to implement I do not think we want this.

I agree that we do not want this.

>
> If treating k1 and k1 is really desirable, then custom
> eq, for example
>
> eq(x, y) ==
> x = y and hash(x) = hash(y)
>
> looks like reliable way. This has advantage that normal
> case is not burdened with extra computations

Are you saying that if no argument is passed, we can assume (hope?)
that the FriCAS designers have already guaranteed that the hash$Key
and =$Key are compatible, so then eq just defaults to =$Key. But if we
want a custom hash and/or equality, then we would locally define eq as
above?

My proposal optimized the (re-)calculation of hash(x). This seems a
bit harder to do if we want it only in the latter case. At minimum I
guess we need to test a flag which could then be moved outside the
while loops.

> -- we do extra work only when needed.
>

What you wrote actually does extra work if hash is passed: It
recomputes both hash(x) and hash(y) every time eq is evaluated. The
specific way eq (or =$Key) is used in XHashTable makes it possible to
compute hash(x) for the target key x just once, then hash(y) and
eq(x,y) for each y in some set of y's already in the table.

Bill.

Waldek Hebisch

unread,
Apr 2, 2015, 8:19:52 PM4/2/15
to fricas...@googlegroups.com
I specifically wrote "given set of independent kernels". Just
now I do not want to think what happens if you give input which
violates implicit assumptions about independence of kernels
made in Expression. Equality in expression computes difference
of two expressions treating them as rational functions of kernes.
Then it reduces powers of algebraic kernels takes into account
their defining relations. If after that you get 0, than
expression are deemed equal, otherwise unequal. If there
are no dependent algebraics this is just equality in apropriate
algebraic extension of field of rational functions. For such
fields a standard fact is that a given function is equal to
one without algebraics. The equality you noted is just
an example, after all canonical form of x is supposed to
be equal to x.

Note: if you introduce dependent kernels most of the times
you get sensible results. But you may also get nonsense
regardless of any problems due to hashing.

> > If treating k1 and k1 is really desirable, then custom
> > eq, for example
> >
> > eq(x, y) ==
> > x = y and hash(x) = hash(y)
> >
> > looks like reliable way. This has advantage that normal
> > case is not burdened with extra computations
>
> Are you saying that if no argument is passed, we can assume (hope?)
> that the FriCAS designers have already guaranteed that the hash$Key
> and =$Key are compatible, so then eq just defaults to =$Key. But if we
> want a custom hash and/or equality, then we would locally define eq as
> above?

When it comes to hash _we_ are FriCAS designers. In Axiom code we
inherited hashing was mostly unimplemented, but implemented part
did not guarante anything. I tried to make sure that hash gets
implemented in way compatible to equality in domans. In
particular I removed default implementation via SXHASH which
violated this condition and I left unimplemented cases where
simple implementation _could_ violate compatibility of
equality and hash.

And I think that only sane use of hash tables is when hash
and equality are compatible. Users of hash tables should
insure this, possibly by modifying equality like above.

>
> My proposal optimized the (re-)calculation of hash(x). This seems a
> bit harder to do if we want it only in the latter case. At minimum I
> guess we need to test a flag which could then be moved outside the
> while loops.
>
> > -- we do extra work only when needed.
> >
>
> What you wrote actually does extra work if hash is passed: It
> recomputes both hash(x) and hash(y) every time eq is evaluated. The
> specific way eq (or =$Key) is used in XHashTable makes it possible to
> compute hash(x) for the target key x just once, then hash(y) and
> eq(x,y) for each y in some set of y's already in the table.

Right, in this specific case your implementation is more
efficient. But I do not think that we should optimize
for such use case -- it looks quite unusual.

--
Waldek Hebisch
heb...@math.uni.wroc.pl

Ralf Hemmecke

unread,
Apr 4, 2015, 2:27:42 AM4/4/15
to fricas...@googlegroups.com
On 03/27/2015 12:15 AM, Waldek Hebisch wrote:
> As was mentioned, if we get hash and equality
> as argument we do not need SetCategory. So I think that
> Key should be just Type and single argument 'table' should
> be conditional on SetCategory.

If we do this, then there are some more places to be considered.

TableAggregate(Key, Entry) currently requires Key: SetCategory. That is
probably because it inherits from KeyedDictionary and IndexedAggregate.

I would find it a little odd, if XHashTable cannot inherit from
TableAggregate if Key is just of type Type.

I don't think that it is technically a big problem, but would like to
get some opinions.

Ralf

Waldek Hebisch

unread,
Apr 4, 2015, 10:00:05 AM4/4/15
to fricas...@googlegroups.com
Ralf Hemmecke wrote:
>
> On 03/27/2015 12:15 AM, Waldek Hebisch wrote:
> > As was mentioned, if we get hash and equality
> > as argument we do not need SetCategory. So I think that
> > Key should be just Type and single argument 'table' should
> > be conditional on SetCategory.
>
> If we do this, then there are some more places to be considered.
>
> TableAggregate(Key, Entry) currently requires Key: SetCategory. That is
> probably because it inherits from KeyedDictionary and IndexedAggregate.
>
> I would find it a little odd, if XHashTable cannot inherit from
> TableAggregate if Key is just of type Type.

I think that categories should make only minimal requirements on
parameters. So it is natural to generalize TableAggregate to
require only Type on first parameter. Some operations may be
conditional, for example 'removeDuplicates' assumes notion of
equality, so it make sense to require BasicType.

Specific domains should make suitable restrictions on parameters,
depending on what they need.

--
Waldek Hebisch
heb...@math.uni.wroc.pl

Ralf Hemmecke

unread,
Apr 4, 2015, 10:18:21 AM4/4/15
to fricas...@googlegroups.com
On 04/04/2015 04:00 PM, Waldek Hebisch wrote:
> I think that categories should make only minimal requirements on
> parameters. So it is natural to generalize TableAggregate to
> require only Type on first parameter. Some operations may be
> conditional, for example 'removeDuplicates' assumes notion of
> equality, so it make sense to require BasicType.

OK.

> Specific domains should make suitable restrictions on parameters,
> depending on what they need.

So the first natural thing is to generalize

Eltable(S : SetCategory, Index : Type) : Category == with

to

Eltable(S : BasicType, Index : Type) : Category == with

https://github.com/hemmecke/fricas/blob/master-hemmecke/src/algebra//aggcat.spad#L797

Ralf

Waldek Hebisch

unread,
Apr 4, 2015, 6:20:08 PM4/4/15
to fricas...@googlegroups.com
Ralf Hemmecke wrote:
>
> On 04/04/2015 04:00 PM, Waldek Hebisch wrote:
> > I think that categories should make only minimal requirements on
> > parameters. So it is natural to generalize TableAggregate to
> > require only Type on first parameter. Some operations may be
> > conditional, for example 'removeDuplicates' assumes notion of
> > equality, so it make sense to require BasicType.
>
> OK.
>
> > Specific domains should make suitable restrictions on parameters,
> > depending on what they need.
>
> So the first natural thing is to generalize
>
> Eltable(S : SetCategory, Index : Type) : Category == with
>
> to
>
> Eltable(S : BasicType, Index : Type) : Category == with
>

Hmm, why not:

Eltable(S : Type, Index : Type) : Category == with

--
Waldek Hebisch
heb...@math.uni.wroc.pl

Bill Page

unread,
Apr 5, 2015, 1:02:20 AM4/5/15
to fricas-devel
On 4 April 2015 at 18:19, Waldek Hebisch <heb...@math.uni.wroc.pl> wrote:
> ...
>> On 04/04/2015 04:00 PM, Waldek Hebisch wrote:
>> > I think that categories should make only minimal requirements on
>> > parameters. So it is natural to generalize TableAggregate to
>> > require only Type on first parameter. Some operations may be
>> > conditional, for example 'removeDuplicates' assumes notion of
>> > equality, so it make sense to require BasicType.
>>

It seems unclear to me what you might mean by "minimal requirements".
When is the minimal requirement not just "none at all". If there are
no requirements then we can always add them by making everything
conditional. No?

> Ralf Hemmecke wrote:
>> So the first natural thing is to generalize
>>
>> Eltable(S : SetCategory, Index : Type) : Category == with
>>
>> to
>>
>> Eltable(S : BasicType, Index : Type) : Category == with
>>
>
> Hmm, why not:
>
> Eltable(S : Type, Index : Type) : Category == with
>

Reading the definition:

++ An eltable over domains D and I is a structure which can be viewed
++ as a function from D to I.
++ Examples of eltable structures range from data structures, e.g. those
++ of type \spadtype{List}, to algebraic structures, e.g. \spadtype{Polynomial}.
Eltable(S : SetCategory, Index : Type) : Category == with
elt : (%, S) -> Index
++ elt(u, i) (also written: u . i) returns the element of u indexed by i.
++ Error: if i is not an index of u.

Do you notice anything strange about the documentation of this
category? S is apparently the "domain" (D) and 'Index' is the
"co-domain" (I) of the function represented by % referred to
initially. So the name 'Index' seems like a very poor choice!
Presumably a better name might have been 'Image' similar to 'Im' which
is used elsewhere in reference to Eltable. It is actually the domain
S that seems like an index to me and that is how it is treated in the
definition of 'elt'.

In any case, if a domain D does not have some concept of equality =
(BasicType) does it make sense to talk about "a function from D to I"?

Are there any domains in FriCAS that one might conceivably use as an
index domain which does not export equality?

I guess I don't understand the desire to "generalize TableAggregate to
require only Type on first parameter.". Is this just generalization
for it's own sake or is there some conceivable use? Backing this up, I
agree with Ralf that I also "would find it a little odd, if XHashTable
cannot inherit from TableAggregate if Key is just of type Type". From
my point of view this is a good reason to keep the requirement that
Key has SetCategory even though I want to be able to pass my own
custom equality when creating an XHashTable. Otherwise this would
push the support for passing a custom equality all the way down to
other domains which have TableAggregate. Do we really want to do that?

Bill Page

unread,
Apr 5, 2015, 1:37:38 AM4/5/15
to fricas-devel
After reviewing all of the discussion in this thread concerning
XHashTable, I find that I now agree with Ralf's original response.
Perhaps it is not a good design decision to pass a custom equality
when creating an XHashTable. In fact, I would go further and claim
that this also applies to the passing of a custom hash function which
is permitted in the current implementation of XHashTable. This
"extra" feature of XHashTable should be removed.

It seemed overly ridged at first to insist on enforcing the contract
between the = and hash functions at the category level because this
forces the creation of a new domain in the case I had in mind, i.e.
using XHashTable in Kernel instead of SortedCache. But now I begin to
see this as the best option. Perhaps what I want is something we
might call "InnerKernel" which exports a much simpler (cheaper) kind
of equality and an associated hash function. Then I would use
InnerKernel together with XHashTable in the Kernel domain itself to
implement a more "mathematical" equality. And now I also think Ralf
and Waldek were right to claim that really this has more to do with
representation than I was initially willing to admit.

That said, I really do want to get back to the original subject of
this thread: adding missing hash functions to Expression and to those
domains on which Expression is dependent.

Ralf Hemmecke

unread,
Apr 5, 2015, 4:49:35 AM4/5/15
to fricas...@googlegroups.com
On 04/05/2015 12:19 AM, Waldek Hebisch wrote:
> Hmm, why not:
>
> Eltable(S : Type, Index : Type) : Category == with

In fact, as I continued to generalize TableAggregate, I came to the same
conclusion, even though I had (like Bill) some conceptual problems with
S not having equality.

See top commit here:

https://github.com/hemmecke/fricas/commits/generalize-tableaggregate

I don't know whether it's the best possible, but at least, it compiles.
Whether we still want to have it after Bill now seems to follow my first
suggestion of having something like InnerKernel, is another question.

In general I agree with Waldek that categories shouldn't have too many
requirements for their arguments. And as for Eltable(A,B), yes,
mathematically it sounds a bit odd that one can have a function

elt: (%, A) -> B

even though one doesn't know whether the elements of A can be
distinguished via equality or not. But on the other hand, look at a
monoid M. It can be considered as a category with just one object X,
where the morphisms are the elements of M. Bill knows certainly enough
category theory to see that there is no need for equality on X to know
what the arrow

m: X -> X (m \in M)

is.

Ralf

Ralf Hemmecke

unread,
Apr 5, 2015, 5:03:40 AM4/5/15
to fricas...@googlegroups.com
On 04/05/2015 07:02 AM, Bill Page wrote:
>> Eltable(S : Type, Index : Type) : Category == with

> Do you notice anything strange about the documentation of this
> category? S is apparently the "domain" (D) and 'Index' is the
> "co-domain" (I) of the function represented by % referred to
> initially. So the name 'Index' seems like a very poor choice!

True. And I think, we should change the arguments to (Dom, Im) as they
are in EltableAggregate.

> Are there any domains in FriCAS that one might conceivably use as an
> index domain which does not export equality?

I guess not. It's even hard to find a domain in FriCAS that is not of
type BasicType.

In fact, in the long run, I would like to remove non-constructive
functions. Or at least adapt their specification. Equality of streams,
for example, is not computable in general. So Stream(X) should not
inherit from BasicType. We can replace that with another function

eq: (%, %) -> Union(Boolean, "failed")

whose specification explicitly says that eq(x,y) might run forever or
return "failed", if the respective algorithm isn't able to decide about
equality. But that's for the future since it needs quite some thought
about what other part of FriCAS it affects.

Ralf

Ralf Hemmecke

unread,
Apr 5, 2015, 5:24:32 AM4/5/15
to fricas...@googlegroups.com
On 04/05/2015 07:37 AM, Bill Page wrote:
> After reviewing all of the discussion in this thread concerning
> XHashTable, I find that I now agree with Ralf's original response.
> Perhaps it is not a good design decision to pass a custom equality
> when creating an XHashTable.

Oh... and I've already started trying to generalize XHashTable. :-( ;-)

> In fact, I would go further and claim that this also applies to the
> passing of a custom hash function which is permitted in the current
> implementation of XHashTable. This "extra" feature of XHashTable
> should be removed.

Yes and no. However, even though I don't yet have a particular use case,
I could imagine that for certain situations it makes sense to create a
hash function which is tailored to the problem at hand. For example,
suppose you want create a hash table where the keys are strings that
alll start with the string "http://.../" where the ... stands for a
rather long URL. Of course, one can cut that part of the string, but one
could also use another hash function.

More importantly, in some cases the default hash function of the domain
would not yield equal distribution of the keys, since only a specific
part of the elements will ever appear as keys. Admittedly, creating such
a specific hash function would be an art, but that is the reason for
allowing a hash function as a parameter to table.

However, if Waldek agrees, I see currently no big problem in dropping
this "extra feature".

> It seemed overly ridged at first to insist on enforcing the contract
> between the = and hash functions at the category level because this
> forces the creation of a new domain in the case I had in mind, i.e.
> using XHashTable in Kernel instead of SortedCache. But now I begin
> to see this as the best option. Perhaps what I want is something we
> might call "InnerKernel" which exports a much simpler (cheaper) kind
> of equality and an associated hash function. Then I would use
> InnerKernel together with XHashTable in the Kernel domain itself to
> implement a more "mathematical" equality. And now I also think Ralf
> and Waldek were right to claim that really this has more to do with
> representation than I was initially willing to admit.

Yes, that's what I said about creating ExpressionRep. Well, you want
InnerKernel, but that goes probably in the same direction. Caching the
representation of two (differnently looking) kernels that are
mathematically equal, can be done with XHashTable if = means equality of
representation. That caches a few more kernels, but otherwise would be
sound wrt. XHashTable's requirements.

Would be somehow nices to "collaps"/"replace" two (mathematically) equal
kernels by a "dominant" one, if the two are recognized as being equal.

Bill, maybe you try to come up with a suggestion that introduces
InnerKernel.

Ralf

Waldek Hebisch

unread,
Apr 5, 2015, 11:50:06 AM4/5/15
to fricas...@googlegroups.com
Bill Page wrote:
>
> On 4 April 2015 at 18:19, Waldek Hebisch <heb...@math.uni.wroc.pl> wrote:
> > ...
> >> On 04/04/2015 04:00 PM, Waldek Hebisch wrote:
> >> > I think that categories should make only minimal requirements on
> >> > parameters. So it is natural to generalize TableAggregate to
> >> > require only Type on first parameter. Some operations may be
> >> > conditional, for example 'removeDuplicates' assumes notion of
> >> > equality, so it make sense to require BasicType.
> >>
>
> It seems unclear to me what you might mean by "minimal requirements".
> When is the minimal requirement not just "none at all". If there are
> no requirements then we can always add them by making everything
> conditional. No?

I think we normally want at least one unconditional signature.
The point is that we do not want to impose unnecessary conditions,
but also we do not want to play semantic games or go to
extraordinary effort to have weak (or no) conditions on parameter.

>
> > Ralf Hemmecke wrote:
> >> So the first natural thing is to generalize
> >>
> >> Eltable(S : SetCategory, Index : Type) : Category == with
> >>
> >> to
> >>
> >> Eltable(S : BasicType, Index : Type) : Category == with
> >>
> >
> > Hmm, why not:
> >
> > Eltable(S : Type, Index : Type) : Category == with
> >
>
> Reading the definition:
>
> ++ An eltable over domains D and I is a structure which can be viewed
> ++ as a function from D to I.
> ++ Examples of eltable structures range from data structures, e.g. those
> ++ of type \spadtype{List}, to algebraic structures, e.g. \spadtype{Polynomial}.
> Eltable(S : SetCategory, Index : Type) : Category == with
> elt : (%, S) -> Index
> ++ elt(u, i) (also written: u . i) returns the element of u indexed by i.
> ++ Error: if i is not an index of u.
>
> Do you notice anything strange about the documentation of this
> category? S is apparently the "domain" (D) and 'Index' is the
> "co-domain" (I) of the function represented by % referred to
> initially. So the name 'Index' seems like a very poor choice!
> Presumably a better name might have been 'Image' similar to 'Im' which
> is used elsewhere in reference to Eltable. It is actually the domain
> S that seems like an index to me and that is how it is treated in the
> definition of 'elt'.

Yes, good catch.

> In any case, if a domain D does not have some concept of equality =
> (BasicType) does it make sense to talk about "a function from D to I"?
>
> Are there any domains in FriCAS that one might conceivably use as an
> index domain which does not export equality?

Note that you wrote "a function". 'elt' is just a function, like
any other. So you ask if all domains should export equality?
If equality in not computable, than having no equality is
natural.

> I guess I don't understand the desire to "generalize TableAggregate to
> require only Type on first parameter.". Is this just generalization
> for it's own sake or is there some conceivable use? Backing this up, I
> agree with Ralf that I also "would find it a little odd, if XHashTable
> cannot inherit from TableAggregate if Key is just of type Type". From
> my point of view this is a good reason to keep the requirement that
> Key has SetCategory even though I want to be able to pass my own
> custom equality when creating an XHashTable. Otherwise this would
> push the support for passing a custom equality all the way down to
> other domains which have TableAggregate.

No. We are talking about _categories_ here. Any domain exporting
given category may impose stronger condition on parameters, and
all domains which need eqality in the parameter will still
require this. Modifying category means that domain like
XHashTable may have the same category like other domains,
without requiring any change in those domains. That is why
having weak conditions on parameters to categories make sense:
domain may always add extra restrictions to parameters, but
can not drop restricions introduced in the category.

--
Waldek Hebisch
heb...@math.uni.wroc.pl

Martin Baker

unread,
Apr 5, 2015, 12:32:32 PM4/5/15
to fricas...@googlegroups.com
I don't know if this is relevant to your discussion about kernels (most
of it goes over my head) but I thought I might mention something I would
like to implement, at some time in the future, in case there is some
requirement that is relevant?

Some time ago I implemented intuitionistic logic with a rep like this:

ILogic() : Exports == Implementation where
....
Rep := Union(_
const : Record(val : Symbol), _
var : Record(str : String), _
binaryOp : Record(typ : Symbol, c1 : %, c2 : %), _
unaryOp : Record(typ : Symbol, c1 : %)_
)

I would like to make this more general and allow many types of
propositional logic and associated algebras such as boolean, (co)Heyting
algebra and so on.

In OpenAxiom Gaby implemented PropositionalFormula using Kernel like this:

PropositionalFormula(T: SetCategory): Public == Private where
....
Rep == Union(T, Kernel %)

Would it be better if I used Kernel like this? I get the impression most
of the issues you are discussing are around casheing. I suspect there
would not be the same performance issues for discrete algebra like this?
I might even question if FriCAS/pan-axiom is the way to go for discrete
algebra? I suspect you are optimising for more numerical based algebra?

Apologies if I have misunderstood your discussion, my SPAD skills are
getting a bit rusty.

Martin

Waldek Hebisch

unread,
Apr 5, 2015, 5:44:03 PM4/5/15
to fricas...@googlegroups.com
>
> I don't know if this is relevant to your discussion about kernels (most
> of it goes over my head) but I thought I might mention something I would
> like to implement, at some time in the future, in case there is some
> requirement that is relevant?
>
> Some time ago I implemented intuitionistic logic with a rep like this:
>
> ILogic() : Exports == Implementation where
> ....
> Rep := Union(_
> const : Record(val : Symbol), _
> var : Record(str : String), _
> binaryOp : Record(typ : Symbol, c1 : %, c2 : %), _
> unaryOp : Record(typ : Symbol, c1 : %)_
> )
>
> I would like to make this more general and allow many types of
> propositional logic and associated algebras such as boolean, (co)Heyting
> algebra and so on.
>
> In OpenAxiom Gaby implemented PropositionalFormula using Kernel like this:
>
> PropositionalFormula(T: SetCategory): Public == Private where
> ....
> Rep == Union(T, Kernel %)
>
> Would it be better if I used Kernel like this?

Yes, this representation looks fine.

> I get the impression most
> of the issues you are discussing are around casheing. I suspect there
> would not be the same performance issues for discrete algebra like this?
> I might even question if FriCAS/pan-axiom is the way to go for discrete
> algebra? I suspect you are optimising for more numerical based algebra?

The discussion is more about equality and builtin simplifications.
We could use representation like above for Expression(Integer).
But then we would have to provide special code to implement
simplifications due to associative law, distributive law, etc.

For Expression(Integer) user expectation is that say 'x*(y + z)'
is equal to 'y*x + x*z' and we implement equality so that this
is true. For logic formula user do not expect such simplifications,
so it is enough to provide equality essentialy as trees. I other
words for your purpose you can just use equality from representation.
Of course, you could get more ambitious and say that two formulas
are equal in your domain if and only if they are logically equivalent.
For pure Boolean propositional logic there are representations with
such property, namely disjuntive (or conjunctive) normal forms
or binary decision diagrams. But if you enrich your logic
a bit then you would have similar (probably worse) problems
to the ones appearing in Expression(Integer).

Conserning preformance issues: AFAIK fast theorem provers use
specialized representation based on normal forms or binary decision
diagrams. And they spent a lot of attention on low level details.
In general purpose system like FriCAS efficiency in normally
lower than in specialized systems. We could try to provide
an optimized domain which internally uses representation like
fast provers. But providing something slower but
working already would be a progress compared to current
situation (that is no domain for logical formulas).

--
Waldek Hebisch
heb...@math.uni.wroc.pl

Bill Page

unread,
Apr 5, 2015, 6:01:49 PM4/5/15
to fricas-devel
On 5 April 2015 at 12:32, Martin Baker <ax8...@martinb.com> wrote:
> I don't know if this is relevant to your discussion about kernels (most of
> it goes over my head) but I thought I might mention something I would like
> to implement, at some time in the future, in case there is some requirement
> that is relevant?
>
> Some time ago I implemented intuitionistic logic with a rep like this:
>
> ILogic() : Exports == Implementation where
> ....
> Rep := Union(_
> const : Record(val : Symbol), _
> var : Record(str : String), _
> binaryOp : Record(typ : Symbol, c1 : %, c2 : %), _
> unaryOp : Record(typ : Symbol, c1 : %)_
> )
>
> I would like to make this more general and allow many types of propositional
> logic and associated algebras such as boolean, (co)Heyting algebra and so
> on.
>
> In OpenAxiom Gaby implemented PropositionalFormula using Kernel like this:
>
> PropositionalFormula(T: SetCategory): Public == Private where
> ....
> Rep == Union(T, Kernel %)
>
> Would it be better if I used Kernel like this?

Kernel provides the means by which a symbolic operator like 'sin' can
be applied to a list of arguments. This is where things like

sin(x)

come from. We can also define new operators and use them in expressions like

f:=operator 'f
f(x)

The Expression domain in FriCAS consists of ratios of polynomials and
a given set of kernels with arguments from Expression (defined
recursively). In your case you might have operators for each
"binaryOp" and "unaryOp". Operators have user specifiable mechanisms
for evaluation, printing etc. so I think it might not be so hard to
use this for your purposes.

> I get the impression most of
> the issues you are discussing are around caching.

One of the important aspects of Kernel is that it a "CachableSet"
which means that a special sorted table is constructed consisting of
unique (not equal) kernels in a given order. Each such kernel is
assign a unique integer encoding its position in this list. I am
especially interested in alternate means of constructing this table or
something equivalent.

> I suspect there would not
> be the same performance issues for discrete algebra like this? I might even
> question if FriCAS/pan-axiom is the way to go for discrete algebra? I
> suspect you are optimising for more numerical based algebra?

The terminology "numerical based algebra" and "discrete algebra" seems
strange but I think I know what you mean. I would say "no" the
concept of a CachableSet is not just for "numerical based algebra".
It is much more general than that.

>
> Apologies if I have misunderstood your discussion, my SPAD skills are
> getting a bit rusty.
>

It would be nice to have you back working on FriCAS. :)

Bill Page

unread,
Apr 5, 2015, 8:09:33 PM4/5/15
to fricas-devel
On 5 April 2015 at 04:49, Ralf Hemmecke <ra...@hemmecke.org> wrote:
> ...
> In general I agree with Waldek that categories shouldn't have too many
> requirements for their arguments. And as for Eltable(A,B), yes,
> mathematically it sounds a bit odd that one can have a function
>
> elt: (%, A) -> B
>
> even though one doesn't know whether the elements of A can be
> distinguished via equality or not.

The function 'elt' itself is not so strange, but what it claims to
represent might be. The existence of 'elt' is supposed to mean that a
value from the domain % (i.e. any domain which satisfies
'Eltable(A,B)') can be viewed as representing a function A->B.

> But on the other hand, look at a
> monoid M. It can be considered as a category with just one object X,
> where the morphisms are the elements of M. Bill knows certainly enough
> category theory to see that there is no need for equality on X to know
> what the arrow
>
> m: X -> X (m \in M)
>
> is.

I am not sure whether or not you are being serious here. There is
certainly a need for equality on morphisms. You are right that
mathematical category theory does not assume any structure on objects.
But I do not see what point you are making. Do you imagine that there
is some application of the FriCAS cateogry 'Eltable' independent of
how it might be used in a domain?

Bill Page

unread,
Apr 5, 2015, 8:13:11 PM4/5/15
to fricas-devel
On 5 April 2015 at 05:03, Ralf Hemmecke <ra...@hemmecke.org> wrote:
>
> In fact, in the long run, I would like to remove non-constructive
> functions. Or at least adapt their specification. Equality of streams,
> for example, is not computable in general. So Stream(X) should not
> inherit from BasicType.

I think that that is an odd suggestion since equality in Expression is
also not computable in general.

> We can replace that with another function
>
> eq: (%, %) -> Union(Boolean, "failed")
>
> whose specification explicitly says that eq(x,y) might run forever or
> return "failed", if the respective algorithm isn't able to decide about
> equality. But that's for the future since it needs quite some thought
> about what other part of FriCAS it affects.
>

Perhaps there is a need for some relations that are not just true or
false in a some conventional closed-world Aristotelian logic sense.

Bill Page

unread,
Apr 5, 2015, 9:23:08 PM4/5/15
to fricas-devel
On 5 April 2015 at 05:24, Ralf Hemmecke <ra...@hemmecke.org> wrote:
> On 04/05/2015 07:37 AM, Bill Page wrote:
>> After reviewing all of the discussion in this thread concerning
>> XHashTable, I find that I now agree with Ralf's original response.
>> Perhaps it is not a good design decision to pass a custom equality
>> when creating an XHashTable.
>
> Oh... and I've already started trying to generalize XHashTable. :-( ;-)
>

No problem. I do not feel strongly on this issue. If you are
convinced that generalization is a good idea, by all means continue.
My conclusion however was that Waldek was right to say initially that
this is not how it is or should be done in FriCAS/Axiom. In general I
think the fact that sometimes things you think you want to do in
FriCAS turn out to be quite hard is because for the most part there is
only one "right way" to do things and the problem is to find this one
right way. It seems to me that this is true by design. I recall at
least one paper that referred to this philosophy as being "Correct by
Design" rather than trying to prove that some piece of code is correct
after the fact.

>> In fact, I would go further and claim that this also applies to the
>> passing of a custom hash function which is permitted in the current
>> implementation of XHashTable. This "extra" feature of XHashTable
>> should be removed.
>
> Yes and no. However, even though I don't yet have a particular use case,
> I could imagine that for certain situations it makes sense to create a
> hash function which is tailored to the problem at hand.

Of course this might be true for "certain situations". But this is
also just as true for the function that represents equality. In fact
it was just such a situation that motivated me to propose this
extension to XHashTable. Since you obviously felt this way about the
hash function it came as a shock to me that you did not immediately
support this for same sort of thing for equality.

> For example,
> suppose you want create a hash table where the keys are strings that
> alll start with the string "http://.../" where the ... stands for a
> rather long URL. Of course, one can cut that part of the string, but one
> could also use another hash function.
>

Sure, but my response here could be exactly the same as your original
response to me concerning equality: Your example suggests that you are
using the wrong representation. If the strings all start with
"http;// ... /" then maybe you should first abstract that part away
and implement a "URL" domain that stores only the variable part of the
url.

> More importantly, in some cases the default hash function of the domain
> would not yield equal distribution of the keys, since only a specific
> part of the elements will ever appear as keys. Admittedly, creating such
> a specific hash function would be an art, but that is the reason for
> allowing a hash function as a parameter to table.

And for some reason you do not think this is possible with equality?

>
> However, if Waldek agrees, I see currently no big problem in dropping
> this "extra feature".
>

Well, I think the motivation should come from a design philosophy
rather than simple expediency.

>> It seemed overly ridged at first to insist on enforcing the contract
>> between the = and hash functions at the category level because this
>> forces the creation of a new domain in the case I had in mind, i.e.
>> using XHashTable in Kernel instead of SortedCache. But now I begin
>> to see this as the best option. Perhaps what I want is something we
>> might call "InnerKernel" which exports a much simpler (cheaper) kind
>> of equality and an associated hash function. Then I would use
>> InnerKernel together with XHashTable in the Kernel domain itself to
>> implement a more "mathematical" equality. And now I also think Ralf
>> and Waldek were right to claim that really this has more to do with
>> representation than I was initially willing to admit.
>
> Yes, that's what I said about creating ExpressionRep. Well, you want
> InnerKernel, but that goes probably in the same direction.

Yes, exactly the same direction.

> Caching the
> representation of two (differnently looking) kernels that are
> mathematically equal, can be done with XHashTable if = means equality of
> representation. That caches a few more kernels, but otherwise would be
> sound wrt. XHashTable's requirements.

If you mean caches a few more kernels than currently cached using
SortedCache then yes, but I think that is unavoidable. The point is
that I want to use XHashTable to *define* equality in Kernel.
Obviously I can't do that if XHashTable depends on equality already
being defined in Kernel. CachableSet and SortedCache expects the
equivalent of an uncached equality function to be passed as part of
the insertion operator (enterInCache). That is more or less what I had
in mind when I proposed a similar extension to XHashTable. But now I
think the basic design of Kernel is flawed.

>
> Would be somehow nices to "collaps"/"replace" two (mathematically) equal
> kernels by a "dominant" one, if the two are recognized as being equal.
>

Well, um that is exactly the content of my algorithm and patch I have
been discussing with Waldek ...

> Bill, maybe you try to come up with a suggestion that introduces
> InnerKernel.

??? Yes, maybe I can ...

Bill Page

unread,
Apr 5, 2015, 9:36:02 PM4/5/15
to fricas-devel
On 5 April 2015 at 11:50, Waldek Hebisch <heb...@math.uni.wroc.pl> wrote:
> Bill Page wrote:
>>
>> Reading the definition:
>>
>> ++ An eltable over domains D and I is a structure which can be viewed
>> ++ as a function from D to I.
>> ++ Examples of eltable structures range from data structures, e.g. those
>> ++ of type \spadtype{List}, to algebraic structures, e.g. \spadtype{Polynomial}.
>> Eltable(S : SetCategory, Index : Type) : Category == with
>> elt : (%, S) -> Index
>> ++ elt(u, i) (also written: u . i) returns the element of u indexed by i.
>> ++ Error: if i is not an index of u.
>>
> ...
>> In any case, if a domain D does not have some concept of equality =
>> (BasicType) does it make sense to talk about "a function from D to I"?
>>
>> Are there any domains in FriCAS that one might conceivably use as an
>> index domain which does not export equality?
>
> Note that you wrote "a function". 'elt' is just a function, like
> any other. So you ask if all domains should export equality?

No. I think you misunderstood my point. Eltable is supposed to mean
that any domain % that satisfies Eltable(D,I) has values "which can be
viewed as a function from D to I". So it is not 'elt' itself but
rather a value from % that needs to behave in this way. 'elt' is just
the interface to this aspect of %. And as far as I can see having
equality in D is an important part of this.

> If equality in not computable, than having no equality is
> natural.
>

Can you give an example?

>> I guess I don't understand the desire to "generalize TableAggregate to
>> require only Type on first parameter.". Is this just generalization
>> for it's own sake or is there some conceivable use? Backing this up, I
>> agree with Ralf that I also "would find it a little odd, if XHashTable
>> cannot inherit from TableAggregate if Key is just of type Type". From
>> my point of view this is a good reason to keep the requirement that
>> Key has SetCategory even though I want to be able to pass my own
>> custom equality when creating an XHashTable. Otherwise this would
>> push the support for passing a custom equality all the way down to
>> other domains which have TableAggregate.
>
> No. We are talking about _categories_ here. Any domain exporting
> given category may impose stronger condition on parameters, and
> all domains which need eqality in the parameter will still
> require this. Modifying category means that domain like
> XHashTable may have the same category like other domains,
> without requiring any change in those domains. That is why
> having weak conditions on parameters to categories make sense:
> domain may always add extra restrictions to parameters, but
> can not drop restricions introduced in the category.
>

So how can the XHashTable domain tell TableAggregate whether or not to
export 'table(hash,eq)' with two arguments but not to expect the same
thing from a domain like AssiationList? Of course maybe it should
also be available in AssociationList. Is that your point?
> --
> You received this message because you are subscribed to the Google Groups "FriCAS - computer algebra system" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to fricas-devel...@googlegroups.com.
> To post to this group, send email to fricas...@googlegroups.com.
> Visit this group at http://groups.google.com/group/fricas-devel.
> For more options, visit https://groups.google.com/d/optout.

Bill Page

unread,
Apr 6, 2015, 12:31:43 AM4/6/15
to fricas-devel
On 5 April 2015 at 17:43, Waldek Hebisch <heb...@math.uni.wroc.pl> wrote:
> ...
> The discussion is more about equality and builtin simplifications.
> We could use representation like above for Expression(Integer).
> But then we would have to provide special code to implement
> simplifications due to associative law, distributive law, etc.
>

I agree that the main point of the discussion is equality. Although
caching is involved, performance is not the main issue. As I now
understand it, the point of caching kernels is to dynamically
establish a set of canonical representatives.

> For Expression(Integer) user expectation is that say 'x*(y + z)'
> is equal to 'y*x + x*z' and we implement equality so that this
> is true.

At the risk of unnecessarily complicating your example which I presume
was intended to address other issues, I would say that it is not
accurate to say that "we implement equality so this is true". In fact
it is not so easy even to represent 'x*(y+z)' as something of type
Expression and if we try we get:

(1) -> y*x+x*z=x*paren(y+z)

(1) x z + x y= x(z + y)
Type: Equation(Expression(Integer))
(2) -> test %

(2) false
Type: Boolean

It is a good question perhaps whether or not it is a good thing that
Expression considers these two expressions not equal.

This needs some explanation. Expression is based first of all on
polynomials (SparseMultivariatePolynomial to be precise). In this
domain multiplication necessarily distributes over addition:

(3) -> x*(y+z)

(3) x z + x y
Type: Polynomial(Integer)

So at a fundamental level this equality is inevitable

(4) -> y*x+x*z=x*(z+y)

(4) x z + x y= x z + x y
Type: Equation(Polynomial(Integer))

In order to write 'x*(z+y)' in Expression and have it stay that way we
need the help of a special operator, i.e. 'paren', which really does
nothing but displays as parenthesis. The representation here is still
as a polynomial but now, in addition to variables x,y, and z, we also
have the kernel 'paren(z+y)' which at the level of polynomials is
treated like just another variable.

But Expression does provide

(5) -> distribute(x*paren(y+z))

(5) x z + x y
Type: Expression(Integer)

> For logic formula user do not expect such simplifications,
> so it is enough to provide equality essentially as trees. I other
> words for your purpose you can just use equality from representation.
> Of course, you could get more ambitious and say that two formulas
> are equal in your domain if and only if they are logically equivalent.
> For pure Boolean propositional logic there are representations with
> such property, namely disjunctive (or conjunctive) normal forms
> or binary decision diagrams. But if you enrich your logic
> a bit then you would have similar (probably worse) problems
> to the ones appearing in Expression(Integer).
>

Yes. Although FriCAS does do a remarkably good job of defining
equality in Expression in terms of canonical forms - especially as
extended for kernels, I am unconvinced that this is possible in
general, that is for the case of "mathematical equality". Worse I
suppose, is that I do not know when or how the current algorithm might
fail. I am interested instead in the details of efficiently
representing equality as a quotient (at least in part). A simple
example of this might be representing equality in formal fractions
without the using GCD cancellation to obtain a canonical form. Or this
old example of defining integers in terms of natural numbers:

http://axiom-wiki.newsynthesis.org/SandBoxDefineInteger

Waldek Hebisch

unread,
Apr 6, 2015, 6:38:13 PM4/6/15
to fricas...@googlegroups.com
Bill Page wrote:
>
> On 5 April 2015 at 17:43, Waldek Hebisch <heb...@math.uni.wroc.pl> wrote:
> > ...
> > The discussion is more about equality and builtin simplifications.
> > We could use representation like above for Expression(Integer).
> > But then we would have to provide special code to implement
> > simplifications due to associative law, distributive law, etc.
> >
>
> I agree that the main point of the discussion is equality. Although
> caching is involved, performance is not the main issue. As I now
> understand it, the point of caching kernels is to dynamically
> establish a set of canonical representatives.

Yes.

> > For Expression(Integer) user expectation is that say 'x*(y + z)'
> > is equal to 'y*x + x*z' and we implement equality so that this
> > is true.
>
> At the risk of unnecessarily complicating your example which I presume
> was intended to address other issues, I would say that it is not
> accurate to say that "we implement equality so this is true". In fact
> it is not so easy even to represent 'x*(y+z)' as something of type
> Expression

OK, I was oversimplifying. One important point it that if we
consider two expressions as equal, then there is no need to
represent differences between them. We may just store
some standarized version. In particular we may go to
so called normal form or canonical form. This is explanined
in chapter 2 of book by J.H. Davenport, Y. Siret, E. Tournier,
"Computer Algebra -- Systems and Algorithms for Algebraic Computation"
which is available at

http://staff.bath.ac.uk/masjhd/masternew.pdf

Also concept of normal and canonical forms is explained
if A = B by Petkovsek, Wilf and Zeilberger.

--
Waldek Hebisch
heb...@math.uni.wroc.pl

Waldek Hebisch

unread,
Apr 6, 2015, 7:07:09 PM4/6/15
to fricas...@googlegroups.com
Hmm. I do not get your point. If some domain or package exports

foo : D -> Something

then 'foo' can be viewed as a function form D to Something. So
I do not get why you think that 'elt' needs equality in D but
accept 'foo' for D with no equality.

> > If equality in not computable, than having no equality is
> > natural.
> >
>
> Can you give an example?

Canonical example are functions given by their code. Also
various lazy infinte structures like computable reals.
Ralf mentioned streams and series: currently we cheat and
provide fake equality but since real equality is not
computable it would be more natural to provide none.

> >> I guess I don't understand the desire to "generalize TableAggregate to
> >> require only Type on first parameter.". Is this just generalization
> >> for it's own sake or is there some conceivable use? Backing this up, I
> >> agree with Ralf that I also "would find it a little odd, if XHashTable
> >> cannot inherit from TableAggregate if Key is just of type Type". From
> >> my point of view this is a good reason to keep the requirement that
> >> Key has SetCategory even though I want to be able to pass my own
> >> custom equality when creating an XHashTable. Otherwise this would
> >> push the support for passing a custom equality all the way down to
> >> other domains which have TableAggregate.
> >
> > No. We are talking about _categories_ here. Any domain exporting
> > given category may impose stronger condition on parameters, and
> > all domains which need eqality in the parameter will still
> > require this. Modifying category means that domain like
> > XHashTable may have the same category like other domains,
> > without requiring any change in those domains. That is why
> > having weak conditions on parameters to categories make sense:
> > domain may always add extra restrictions to parameters, but
> > can not drop restricions introduced in the category.
> >
>
> So how can the XHashTable domain tell TableAggregate whether or not to
> export 'table(hash,eq)' with two arguments but not to expect the same
> thing from a domain like AssiationList? Of course maybe it should
> also be available in AssociationList. Is that your point?

'table(hash,eq)' should be exported by XHashTable. We do not need
such export in TableAggregate. I we wished to create a more
general concept, allowing association lists with arbitrary
equality then we would introduce extra category. Alternatively,
if you think that all aggregates with search-like functionality
should allow 'eq', then you may go trough all domains and
adjust them. But association list should have no need of
'hash', so this does not look like good generalization.

AFAICS modification to XHashTable can be done without touching
any other domain, in particular other domains will not get any
extra exports. We need to adjust categories, but relaxed
parameters do not affect any existing domain.

--
Waldek Hebisch
heb...@math.uni.wroc.pl

Waldek Hebisch

unread,
Apr 6, 2015, 7:12:25 PM4/6/15
to fricas...@googlegroups.com
Bill Page wrote:
>
> On 5 April 2015 at 05:24, Ralf Hemmecke <ra...@hemmecke.org> wrote:
> > On 04/05/2015 07:37 AM, Bill Page wrote:
> >> After reviewing all of the discussion in this thread concerning
> >> XHashTable, I find that I now agree with Ralf's original response.
> >> Perhaps it is not a good design decision to pass a custom equality
> >> when creating an XHashTable.
> >
> > Oh... and I've already started trying to generalize XHashTable. :-( ;-)
> >
>
> No problem. I do not feel strongly on this issue. If you are
> convinced that generalization is a good idea, by all means continue.
> My conclusion however was that Waldek was right to say initially that
> this is not how it is or should be done in FriCAS/Axiom.

I think that generalizing XHashTable is a good idea. OTOH I
think that trying to use it to implement Kernel (or Expression)
has serious problems.

--
Waldek Hebisch
heb...@math.uni.wroc.pl

Bill Page

unread,
Apr 7, 2015, 11:52:32 AM4/7/15
to fricas-devel
On 6 April 2015 at 19:12, Waldek Hebisch <heb...@math.uni.wroc.pl> wrote:
>
> I think that generalizing XHashTable is a good idea. OTOH I
> think that trying to use it to implement Kernel (or Expression)
> has serious problems.
>

Besides the fact that there is no obvious efficient way to represent
an order relation on the keys in a hash table, what other serious
problems do you forsee?

Bill Page

unread,
Apr 7, 2015, 12:37:16 PM4/7/15
to fricas-devel
> then 'foo' can be viewed as a function from D to Something. So
> I do not get why you think that 'elt' needs equality in D but
> accept 'foo' for D with no equality.
>

I am not concerned much about defining such a function, though maybe
it would be a bit more clear if we had

elt: % -> (S -> I)

What bothers me more is the meaning of the category Eltable. In
typical Axiom style we are not told very much about what it means. In
some cases, e.g. Ring, not much more is needed than a reference to
ring as a well-known mathematical object. But really we should
probably expect a set of axioms to be associated with each category.
What axioms should we associate with Eltable? We are only told that
the members of % are "data structures" and "algebraic structures' that
have some kind of "indexed elements". My point is that I suspect that
we can not specify reasonable axioms for Eltable without requiring
equality on S. Maybe I am wrong.

>> > If equality in not computable, than having no equality is
>> > natural.
>> >
>>
>> Can you give an example?
>
> Canonical example are functions given by their code. Also
> various lazy infinite structures like computable reals.
> Ralf mentioned streams and series: currently we cheat and
> provide fake equality but since real equality is not
> computable it would be more natural to provide none.

Then this should also apply to Expression, no? If not, then why can we
not do for these other domains what we do for Expression?

>>
>> So how can the XHashTable domain tell TableAggregate whether or not to
>> export 'table(hash,eq)' with two arguments but not to expect the same
>> thing from a domain like AssiationList? Of course maybe it should
>> also be available in AssociationList. Is that your point?
>
> 'table(hash,eq)' should be exported by XHashTable. We do not need
> such export in TableAggregate. I we wished to create a more
> general concept, allowing association lists with arbitrary
> equality then we would introduce extra category. Alternatively,
> if you think that all aggregates with search-like functionality
> should allow 'eq', then you may go trough all domains and
> adjust them. But association list should have no need of
> 'hash', so this does not look like good generalization.
>
> AFAICS modification to XHashTable can be done without touching
> any other domain, in particular other domains will not get any
> extra exports. We need to adjust categories, but relaxed
> parameters do not affect any existing domain.
>

So, with TableAggregate(Key: Type,Entry: Type), if we specify a Key
that does not export =$key will TableAggregate still export 'table()'
with no arguments? If so, what will XHashTable do to implement it?

Bill Page

unread,
Apr 7, 2015, 1:51:04 PM4/7/15
to fricas-devel
On 6 April 2015 at 18:38, Waldek Hebisch <heb...@math.uni.wroc.pl> wrote:
> Bill Page wrote:
>>
>> At the risk of unnecessarily complicating your example which I presume
>> was intended to address other issues, I would say that it is not
>> accurate to say that "we implement equality so this is true". In fact
>> it is not so easy even to represent 'x*(y+z)' as something of type
>> Expression
>
> OK, I was oversimplifying. One important point it that if we
> consider two expressions as equal, then there is no need to
> represent differences between them. We may just store
> some standarized version. In particular we may go to
> so called normal form or canonical form. This is explanined
> in chapter 2 of book by J.H. Davenport, Y. Siret, E. Tournier,
> "Computer Algebra -- Systems and Algorithms for Algebraic Computation"
> which is available at
>
> http://staff.bath.ac.uk/masjhd/masternew.pdf
>

Yes, I think this is a bit dated but still a good reference. Thank
you. For example:

page 80, quote:
Brown's storage method [1969] raises this question of regularity. Given
a normal representation, he proposes to construct a canonical representa-
tion. All the expressions which have already been calculated are stored,
a_1, ... a_n. When a new expression b is calculated, for all the a_i we test
whether b is equal to this a_i (by testing whether a_i−b i s zero). If b=a_i,
we replace b by a_i ,otherwise a_{n+1} becomes b, which is a new expression.
This method of storing yields a canonical representation, which, however,
is not at all regular, because it depends entirely on the order in which the
expressions appear. Moreover, it is not effi cient, for we have to store all the
expressions and compare each result calculated with all the other stored
expressions.
end_quote

Bill Page

unread,
Apr 7, 2015, 5:07:47 PM4/7/15
to fricas-devel
Some people might be interested in a draft new version of a similar
book by Davenport here:

http://staff.bath.ac.uk/masjhd/JHD-CA.pdf

It is incomplete in a few places but contains a lot of new material.

Waldek Hebisch

unread,
Apr 8, 2015, 6:21:07 PM4/8/15
to fricas...@googlegroups.com
1) You can use equality in axioms without having equality as
export.
2) I am not sure we can give nice axioms for Eltable. To say
the truth I am affraid that nice axioms are just product
of sloppy way of developing math. More precisely, we
can give nice axioms for isolated domains. But when we
try to fit them together we frequently have cases when
we want to use code even when assumptions are violated.
Case in point is Euclid's algorithm. Theoreticaly we
need EuclideanDomain, that is ring without zero divisors
to use it. But we may try to use it for polynomials over
product of fields. If we hit zero divisor, then we fail.
But if all divisons work then we get GCD.
3) Equality is fundamental to math, frequently it is considered
part of logic. Without equality it is hard to formulate
any axioms. So still I do not see why you worry about
Eltable and accept categories and domais without equality.

> >> > If equality in not computable, than having no equality is
> >> > natural.
> >> >
> >>
> >> Can you give an example?
> >
> > Canonical example are functions given by their code. Also
> > various lazy infinite structures like computable reals.
> > Ralf mentioned streams and series: currently we cheat and
> > provide fake equality but since real equality is not
> > computable it would be more natural to provide none.
>
> Then this should also apply to Expression, no? If not, then why can we
> not do for these other domains what we do for Expression?

Expression is at edge of computability: on large subsets mathematical
equality is decidable. And current equality is a well defined
equivalence relation which agrees with mathematical equality
on many important subsets. We do not know is something similar
is possible for other domais (and if possible how to do it).

> > AFAICS modification to XHashTable can be done without touching
> > any other domain, in particular other domains will not get any
> > extra exports. We need to adjust categories, but relaxed
> > parameters do not affect any existing domain.
> >
>
> So, with TableAggregate(Key: Type,Entry: Type), if we specify a Key
> that does not export =$key will TableAggregate still export 'table()'
> with no arguments? If so, what will XHashTable do to implement it?

Well, 'table()' should be a conditional export of TableAggregate...

--
Waldek Hebisch
heb...@math.uni.wroc.pl

Bill Page

unread,
Apr 10, 2015, 2:10:49 PM4/10/15
to fricas-devel
On 8 April 2015 at 18:20, Waldek Hebisch <heb...@math.uni.wroc.pl> wrote:
> Bill Page wrote:
>> ...
>> What bothers me more is the meaning of the category Eltable. In
>> typical Axiom style we are not told very much about what it means.
>> In some cases, e.g. Ring, not much more is needed than a reference
>> to ring as a well-known mathematical object. But really we should
>> probably expect a set of axioms to be associated with each category.
>> What axioms should we associate with Eltable? We are only told that
>> the members of % are "data structures" and "algebraic structures' that
>> have some kind of "indexed elements". My point is that I suspect that
>> we can not specify reasonable axioms for Eltable without requiring
>> equality on S. Maybe I am wrong.
>
> 1) You can use equality in axioms without having equality as
> export.

The concept of using equality in the definition of a category without
exporting it strongly contrasts your view of the FriCAS/Axiom library
from my own. I specifically said library and not just SPAD because
you are perfectly correct that it is possible to do anything you like
when using SPAD as a general purpose programming language, say for
writing a accounting application. But in the design of the FriCAS
library it seems to me very natural to think of every domain and
category as a mathematical object - not just *representing* some
mathematics but actually *being* a concrete realization of some
mathematical concept. Of course to a greater or lesser extent the
identification of FriCAS objects as mathematical depends on one's
imagination but this is strongly aided by the association of exported
operations with essential attributes of objects. In essence an object
is defined by what it exports (and a set of axioms).

So for example among a few other things RING exports:

?*? : (%,%) -> %
?+? : (%,%) -> %
-? : % -> %
?=? : (%,%) -> Boolean
1 : () -> %
0 : () -> %

with these axioms:

For all a, b in %, the result of the operation a + b is also in %.
For all a, b in %, the result of the operation a * b is also in %.

There exists an element 0 in %, such that for all elements a in %, the
equation 0 + a = a + 0 = a holds.
For all a, b and c in %, the equation (a + b) + c = a + (b + c) holds.
For all a and b in %, the equation a + b = b + a holds.
For each a in %, there exists an element -a in R such that a + (-a) =
0, where 0 is the additive identity element.

There exists an element 1 in %, such that for all elements a in %, the
equation 1 * a = a * 1 = a holds.
For all a, b and c in %, the equation (a * b) * c = a * (b * c) holds.
For all a and b in %, the equation a * b = b * a holds.

For all a, b and c in %, the equation (a + b) * c = (a * c) + (b * c) holds.
For all a, b and c in %, the equation a * (b + c) = (a * b) + (a * c) holds.

These axioms can all be expressed in terms of just those the operators
that are exported. It would seem exceedingly strange to me if Ring did
not export =.

> 2) I am not sure we can give nice axioms for Eltable. To say
> the truth I am afraid that nice axioms are just product
> of sloppy way of developing math.

That is a remarkable statement!! Of course I did not claim that the
axioms had to be "nice" whatever that means.

> More precisely, we
> can give nice axioms for isolated domains. But when we
> try to fit them together we frequently have cases when
> we want to use code even when assumptions are violated.

For me this would mean that it was time to review the axioms.

> Case in point is Euclid's algorithm. Theoretically we
> need EuclideanDomain, that is ring without zero divisors
> to use it. But we may try to use it for polynomials over
> product of fields. If we hit zero divisor, then we fail.
> But if all divisons work then we get GCD.

Now you introduce a new concept: "fail". To me this means that the
logic is no longer simply Aristotelian (i.e. every statement is either
"true" or "false"). In principle I have no objection to this but I
think it deserves to be reconized explicitly and for the most part in
FriCAS it is via the 'Union(%,"failed")' construct.

> 3) Equality is fundamental to math, frequently it is considered
> part of logic. Without equality it is hard to formulate
> any axioms. So still I do not see why you worry about
> Eltable and accept categories and domains without equality.
>

Yes you are right. In fact it seems to me that a domain without any
kind of equality is a kind of lie. But a fundamental division in the
FriCAS library occurs at the top of the category lattice heterarchy
between BasicType and Aggregate. Everything below BasicType exports =
while those below Aggregate export

eq? : (%,%) -> Boolean

where

eq?(x,y) -> true if x is identical to y, otherwise false

Presumably 'eq?' is often implemented as just EQ$Lisp. Aggregates are
supposedly "data structure"-like objects with things of BasicType are
supposedly "algebraic". But some domains are both, e.g. Set.

My point here is to refer again to the article by Davenport "Equality
in Computer Algebra and Beyond".

It is not clear to me whether or not Eltable is intended to be "data
structure"-like or not and whether the index domain S should also be
just "data structure"-like or mathematical. Right now the definition
says it should be mathematical.

>> >> > If equality in not computable, than having no equality is
>> >> > natural.
>> >> >
>> >>
>> >> Can you give an example?
>> >
>> > Canonical example are functions given by their code. Also
>> > various lazy infinite structures like computable reals.
>> > Ralf mentioned streams and series: currently we cheat and
>> > provide fake equality but since real equality is not
>> > computable it would be more natural to provide none.
>>
>> Then this should also apply to Expression, no? If not, then why can we
>> not do for these other domains what we do for Expression?
>
> Expression is at edge of computability: on large subsets mathematical
> equality is decidable. And current equality is a well defined
> equivalence relation which agrees with mathematical equality
> on many important subsets. We do not know is something similar
> is possible for other domains (and if possible how to do it).
>

If equality really is not computable than maybe it would be more
"natural" if equality was defined as

= : (?,?) -> Union(Boolean,"failed")

>> > AFAICS modification to XHashTable can be done without touching
>> > any other domain, in particular other domains will not get any
>> > extra exports. We need to adjust categories, but relaxed
>> > parameters do not affect any existing domain.
>> >
>>
>> So, with TableAggregate(Key: Type,Entry: Type), if we specify a Key
>> that does not export =$key will TableAggregate still export 'table()'
>> with no arguments? If so, what will XHashTable do to implement it?
>
> Well, 'table()' should be a conditional export of TableAggregate...
>

So this means that TableAggregate only provides a means to create
members of % if Key is of type SetCategory, otherwise creating members
is somehow unique to the domain. This works but I am not sure if we
should consider it fully satisfactory.

Bill.

Waldek Hebisch

unread,
Apr 11, 2015, 9:29:33 PM4/11/15
to fricas...@googlegroups.com
I agree that this is nice to have close correspondence
between exported signatures and axioms. In particular
one try to use axioms which are just equalities valid
for all tuples of arguments and use signatures to
like '0' ore 'inv' to constructively represent
existential axioms. But I am affraid that consistently
requesting this is too limiting.

> > 2) I am not sure we can give nice axioms for Eltable. To say
> > the truth I am afraid that nice axioms are just product
> > of sloppy way of developing math.
>
> That is a remarkable statement!! Of course I did not claim that the
> axioms had to be "nice" whatever that means.
>
> > More precisely, we
> > can give nice axioms for isolated domains. But when we
> > try to fit them together we frequently have cases when
> > we want to use code even when assumptions are violated.
>
> For me this would mean that it was time to review the axioms.

Yes, in principle we should do this. But then axioms no
longer are nice. In particular, because they are different
than texbook version, we need to develop our own theory.
Of course for affected operations we need to prove that
they work anyway. But type correctness may force us to
propagate changes to seemingly unrelated places. Currently
one approach I use is to call 'error' in case when given
piece of code is called when real assumptions (the ones
beyond official signature) are not satisfied. But this
clearly breaks encoding of existential quantification
via signatures: since we no longer know if given function
will return, we do not know if the corresponging object
exists.

>
> > Case in point is Euclid's algorithm. Theoretically we
> > need EuclideanDomain, that is ring without zero divisors
> > to use it. But we may try to use it for polynomials over
> > product of fields. If we hit zero divisor, then we fail.
> > But if all divisons work then we get GCD.
>
> Now you introduce a new concept: "fail". To me this means that the
> logic is no longer simply Aristotelian (i.e. every statement is either
> "true" or "false"). In principle I have no objection to this but I
> think it deserves to be reconized explicitly and for the most part in
> FriCAS it is via the 'Union(%,"failed")' construct.

Well, using 'Union(%,"failed")' leads to correct code, but the
code is longer and more complicated than uncoditional code.
Also, "failed" may propagate to different places. So to
limit profileration of "failed" we may be forced to have two
versions: unconditional one (without failed) and the second
one using "failed". But now we loose advantage of generic
coding: the two versions are essentially identical, but use
different types so we need two separate copies of the code.


>
> > 3) Equality is fundamental to math, frequently it is considered
> > part of logic. Without equality it is hard to formulate
> > any axioms. So still I do not see why you worry about
> > Eltable and accept categories and domains without equality.
> >
>
> Yes you are right. In fact it seems to me that a domain without any
> kind of equality is a kind of lie. But a fundamental division in the
> FriCAS library occurs at the top of the category lattice heterarchy
> between BasicType and Aggregate. Everything below BasicType exports =
> while those below Aggregate export
>
> eq? : (%,%) -> Boolean
>
> where
>
> eq?(x,y) -> true if x is identical to y, otherwise false
>
> Presumably 'eq?' is often implemented as just EQ$Lisp. Aggregates are
> supposedly "data structure"-like objects with things of BasicType are
> supposedly "algebraic". But some domains are both, e.g. Set.

I must admit that I am not sure why 'eq?' is exported for 'Aggregate'
but not for other types. One guess may be that 'eq?' is needed
to detect cyceles in lists and graphs. It is also useful
for speeding up equality of aggregates (but may be used to
speed up equality for most types, so why no 'eq?' for other types).

> My point here is to refer again to the article by Davenport "Equality
> in Computer Algebra and Beyond".
>
> It is not clear to me whether or not Eltable is intended to be "data
> structure"-like or not and whether the index domain S should also be
> just "data structure"-like or mathematical. Right now the definition
> says it should be mathematical.

I do not think there is really any deep distinction between
"data structures" and "mathematical" domains. There is theory
of lists and array, not very deep but comparable to begining
theory of groups. And sets are both a data structuture and
a rich mathematical object.

>
> If equality really is not computable than maybe it would be more
> "natural" if equality was defined as
>
> = : (?,?) -> Union(Boolean,"failed")

Maybe. But any attempt at equality would be on "best effort"
basis. Also there are cases where failure is the expected
outcome, for example for functions given by code. So unlike
cases I metioned above where failure is rare and we can
avoid it via randomization and repeating computations we
can not avoid failure. So if we want algorithm with
guarented result, we can not use equality. And to make
sure we do not use it we should not provide it.

>
> >> > AFAICS modification to XHashTable can be done without touching
> >> > any other domain, in particular other domains will not get any
> >> > extra exports. We need to adjust categories, but relaxed
> >> > parameters do not affect any existing domain.
> >> >
> >>
> >> So, with TableAggregate(Key: Type,Entry: Type), if we specify a Key
> >> that does not export =$key will TableAggregate still export 'table()'
> >> with no arguments? If so, what will XHashTable do to implement it?
> >
> > Well, 'table()' should be a conditional export of TableAggregate...
> >
>
> So this means that TableAggregate only provides a means to create
> members of % if Key is of type SetCategory, otherwise creating members
> is somehow unique to the domain. This works but I am not sure if we
> should consider it fully satisfactory.

Actually I am affraid we can not prevent creating tables just by
restrictng signatures: Aggregate unconditionally exports 'empty()',
so trying to create empty hash table will be allowed by type
system and all we can do it to singal error when Key does not
have BasicType.

--
Waldek Hebisch
heb...@math.uni.wroc.pl
Reply all
Reply to author
Forward
0 new messages