Draft for generics

15 views
Skip to first unread message

Qtvali

unread,
Dec 20, 2009, 2:30:35 PM12/20/09
to golang-nuts
I did read code examples and some argumentation from PDF document
about generics, which Russ had put here. I did notice something, which
could be expressed by paraphrasing Lisp's "slogan" - all those
implementations are "slow, buggy and complicated implementations of
half of Prolog". To make it clear - I have no experience with Prolog.
I just know the basics.

Citate: Any sufficiently complicated C or Fortran program contains an
ad hoc, informally-specified, bug-ridden, slow implementation of half
of Common Lisp.

Thus, I did arrive to those arguments/implications:
* Prolog is, at this time, optimized and fast query language for
relations between objects
* Templates, in general, mostly rely on clearly defined relations
* There are open source implementations of languages and libraries,
which allow one to use the power of prolog in his own code
* Prolog is advanced and turing-complete language in itself
* One can work out Prolog queries needed to generate code from
templates, not invent the case by himself
* Prolog is old, mature and well-established language with clear
patterns of how to do something and some kind of proof of it's
extensibility and usability
* It's possible to use only few features from prolog in concept's
syntax, choosing the faster and more necessary ones, still being ready
to add others very fast
* Instead of using "==>" kind of syntax, I will make headers of
concepts look similar to imperative commands with their return values.
This doesn't make this semantically less powerful, but it indeed makes
it simpler to use for someone having mostly imperative background; it
also makes Go's syntax more coherent, should this come into use.

This introduction given, I add a draft written as little research into
this possibility. As I write this things in word processor and copy
into email, adding introductions later, I will repeat myself here and
there. One, knowing Prolog a bit, could be sure that implementation of
something similar is relatively simple:

Proposal for generic programming in Go

With generics, one of the primary problem is how to create
constraints.

Constraints in Go are primarily related to interfaces – it must be
possible to force a generic to be compatible with some interface. As
subclasses of generic classes must be able to join several interfaces
into one, generic interface is the first thing to be declared. Lets
call it a “concept”.

I think that logic programming languages like Prolog are extremely
powerful in creating things like that. I don't know if they are fast
enough. Let's try. For speed, I think that compability of several
types could be calculated once and used several times; as I have
created a parser of subset of Prolog (math logic part) in OCaml as
test job to some company [took 4 days using strictly functional
paradigm logic, which was in requirements], I can say it's not too
much to have. I did choose Prolog-like expressions to remove the need
for large number of keywords, also to provide extensibility. Of
course, there could be short-cut syntax for some relations, but I try
to set up semantics first.

First, there should be some general syntax of type comparison.

First thing to be known is if type is struct or interface.
IsStruct([T]);
IsInterface([T]);

Second thing to be known is if type has specific method.
HasMethod([T], lessThan([T]) bool);

Then it's important if some method accepts it as a parameter.
FunctionExists(make([T], int));

Power of Prolog is really seen here...
HasMethod([T], length() [L]);

And could those lengths be compared?
HasMethod([L], Cmp([L]));

And you might not care about return type of a function:
HasMethod([T], run() _);
But you always care about inputs.

The last statement does not need L to be defined as input of type call
– it rather creates it as output. The pattern will be evaluated that
there is a method “L”, which is to be used on “T”.

It's needed to know if T has some interface..
InterfaceApplies([T], Stringer);
InterfaceApplies([T], [N]);

And if it inherits some other type...
StructIsContained([T], BaseType);
StructIsContained([T], [N]);

// Or to not call “IsStruct([T])” implicitly...
InheritsFrom([T], [N]);

And what is returned by built-in compatible constant method:
FunctionExists(unsafe.sizeof[T]() ([l] int));
(we can later declare array of size l in some implementation of this
concept).

Or:
FunctionExists(unsafe.sizeof[T]() ([l]));
StructIsEqual([l].type, int);

Or:
IsPointer(T);
IsNonPointer(T);
To restrict that this concept only applies to pointers or non-
pointers. Pointers have some special uses, also the non-pointers.

Put it all together:
type Z[T] [L, l] concept {
// var L, l;
IsStruct([T]);
HasMethod([T], lessThan([T]) bool);
FunctionExists(make([T], int));
HasMethod([T], length() [L]);
InterfaceApplies([T], Stringer);
StructIsContained([T], BaseType);
FunctionExists(unsafe.sizeof[T]() ([l] int));
// For convinience, such things are possible:
[l].type == int;
}

Now, we can declare a relation:
// Comment: if two classes are friends,
// one can do unsafe copy of their binary
// content safely. We won't use brackets
// as every single combination of classes
// needs this to be declared separately.
type AreFriends(A, B) concept {
};

Which allows us to declare that:
const AreFriends(MyInt, int) concept;

Now we can create a subconcept of Z, which needs “lengths” to also be
friends so that we can copy their content safely.

type T[B, C] [I, J] concept {
ConceptApplies(Z[B]);
Or:
ConceptApplies(Z[B], [L, l]);
// Or just:
Z[B] [L, _];
Z[C] [LC, _];
// Will check if values returned
// by length method of both classes
// are friends (like MyInt and int).
AreFriends(L, LC);
}

Now, what is not clear yet is how to get things going in such case we
would actually accept several different signatures, like we dont care
if there is “length” method or are the classes themselves “lengths”.

This needs “or” and “and” to be declared:

type HasLength[B] [L] concept {
Z[T] [L, _];
}

func [h HasLength] getLength() (len [L]) {
return h.length();
}

/*
Implicitly declares L as float if N is of type Len.
*/
type IsLength[B] [L] concept {
StructIsEqual([B], [L: int]) || StructIsEqual([B], [L: float]) ||
(StructIsEqual([B], [N: Len]) && StructIsEqual([L], float));
getLength() L; // Demands this method to be declared in all cases.
}

/*
StructIsEqual([B], [L: int]) might be fuzzy at first sight, but it
simply declares that there is such subclass of int, which is of type
struct (IsStruct(L) is implied). Type variable N will get a property
that it's a subclass of Len, whenever this StructIsEqual is matched –
as there are no constraints for N, it's simply a placeholder and
compiler might demand replacing it with “_”, because this variable is
not exported or used. Otherwise, IsStringer(N: Len) would need there
to be such parent structure of [B], which is also a stringer (Is*(?)
should be convenient way to write InterfaceApplies([?], *)). Should
there be any need to use this parent structure directly for declaring
types (IsStruct([N]) or IsInterface([N]) or possibly some other set of
functions like IsConcrete([N]) should guarantee that a count of such
structs/interfaces is one), it must be guaranteed that this concept
does not allow any misunderstanding that this is a struct or interface
– it must directly declare that or this must be possible to imply that
(again, Prolog-like syntax makes such implications straightforward).
*/

func [h IsLength[B] [L : int || float; !L: Len]] getLength() (len
L.type) {
return B;
}

func [h IsLength[B] [L : Len]] getLength() (len L.type) {
return B.getFloat();
}

To be used (by implicit cast, which could be removed by compiler if
optimizing):
t := 0.0;
IsLength(t).getLength();
// Which needs t to apply to IsLength.

Or:
func getLength[B][IsLength[B] [L : int]](h L) {
return h;
}

To be used as:
getLength[int](5);

Additional syntax I used – [L : int] is clearly restricting in case
those values inside brackets need to be ordered. Thus, consider this:
func getLength[B][IsLength[B] [IsInt[L]; IsComparable[L]]](h L) {
return h;
}
Where semicolon is somewhat equivalent to “&&”.

[L: int; L: float] impossible constraint can be thus mapped to [L: int
&& float] or [(L: int) && (L: float)]. I think that [L: int, float]
could be considered also.

Now, we want to compare and assign lengths. We need a concept, which
shows that two types have lengths and we can assign them:

type HaveAssignableLengths[A, B] [L] concept {
// This also restricts that lengths of A and B
// must be of same type.
IsLength[A] [L] || HasLength[A] [L];
IsLength[B] [L] || HasLength[B] [L];
IsAssignable([L], [L]);
}

We haven't yet inherited anything from IsLength or HasLength (they
both have getLength defined), thus we need a keyword – Inherit.
type HaveAssignableLengths[A, B] [L] concept {
/.../
Inherit
// This allows the use of A.getLength()
?IsLength[A];
?HasLength[A];
?IsLength[B];
?HasLength[B];
}

Those inherits make it so that we can use getLength method on A and B.
“?” means we only need to inherit if the related constraint allows it
(could use “||” as well, but this might be complex). As Prolog is not
sensitive to order (yes I know Go wants to be, but this is hard in
that case – maybe there is some solution to that, like only imported
packages are used), we can actually check for existence of the
methods, which have to be inherited, before we are doing this inherit
itself – this allows us to use interface-like constraints to check our
own code. Should those inherits conflict (some int-compatible type has
also compatibility with length generic), we go into conflict. We need
if-else clauses to be declared there (they could be done as If
(constraint, iftrue, iffalse) or syntactically); this conflict,
anyway, will be resolved – it's actually possible to resolve it before
template is actually used just by knowing that some type being subtype
of “int” does not necessarily disallow it to have getLength method.
“Not” must clearly be defined so that if we declare functions, which
might both have overlapping constraints, we can say that second one is
used only if constraint of first one does not apply.

Should we use HaveAssignableLengths[A, A], it also checks that
inherits of both parameters are compatible – same method is inherited
for both so that A.getLength() has one clear meaning. This is also
implication of using Prolog-like language.

Now, we can do (suppose we declared setLength, too):
func assignLength[A, B] [HaveAssignableLengths[A, B]](a A, b B) {
a.setLength(b.getLength());
}

Ok, lets define a concrete type now:
type AssignableLengthStruct[A] [L] struct {
HaveAssignableLengths[A, A] [L];
tmp L; // some temporary variable
}

func [AssignableLengthStruct[A]]copyLength(other A) {
}

type aint[A][AssignableLengthStruct[A] [L]] interface {
getLength() L;
setLength(len L);
}

And use it:
var a, b AssignableLengthStruct[float];
// init
a.copyLength(b);

Or:
typedef F AssignableLengthStruct[float];
var a, b F;
// init
a.copyLength(b);


Casting

Concepts like “IsLength” can not be directly used nor anything cast to
them. This speeds up and simplifies compiler. Thus, struct must be
defined as in (I use inclusive or – I don't know if it's easy to
implement that preferably methods from “HasLength” will be included;
also syntax should actually clearly say here if methods from HasLength
should be applied to T or not):

type Lenghted[T][HasLength[T][L] || IsLength[T][L]] struct {
a T;
}

func (z Lenghted[T])getLength() L {
return z.a.getLength();
}

func length(z Lenghted[T]) L {
return z.a.getLength();
}

// Now, the last point – for simplicity of implementation, runtime
should not be able to do casts with generic arguments for unknown
types. Otherwise, inheritance will be messed up. LenghtedInt should be
declared.
type LenghtedInt Lenghted[int];

So lets see what actually happens when this concrete type,
LenghtedInt, is declared.

type Lenghted[T][HasLength[T][L] || IsLength[T][L]] struct {
a T;
}

IsLength will be matched and it's methods declared – type T will have
“getLength” method, because of:

type IsLength[B] [L] concept {
StructIsEqual([B], [L: int]) || StructIsEqual([B], [L: float]) ||
(StructIsEqual([B], [N: Len]) && StructIsEqual([L], float));
getLength() L; // Demands this method to be declared in all cases.
}

func [h IsLength[B] [L : int || float; !L: Len]] getLength() (len
L.type) {
return B;
}

There, from StructIsEqual([B], [L: int]) it's clear that there is a
match – “Lenghted” of type int is possible. Function will be declared
as it's signature matches the case. By all means, this “return B”
should be inlined to caller. In case inlining is not possible, there
should be generated code, which is able to detect real type of L –
this would mean the following real method signature generated from
Prolog code (and that's actually a harder case):

func [h *interface{}] getLength(type int) (len *interface{}) {
switch type {
case 0, 1:
return B;
}
panic(“Compiler bug”);
}

What is done inside is that Prolog's query is used to get a list
containing “int; float”. As I don't know advanced optimizations used
by up-to-date prolog interpretator, I will not discuss details here; I
know anyway that there are many optimizations done to get Prolog
implementations fast and there are libraries available to use. I did
intentionally not use type switch there. For each item in the list, it
will be run over code tree and type-independent ast node is generated.
As the last step, those ast nodes will be matched to throw out
identical ones – which might result in only one case as in this
example. This case might be, then, moved to be function's body. If
anything can be inlined as last step, this will be done by optimizer,
which will go on with this function – several optimizations can be
done with information known, but implementation of templates
themselves could have some internal complexity and it's probably
better to write optimizations there; later, anyway, it's possible to
optimize this template engine a lot – power of Prolog will allow one
to do many queries to understand the exact nature of each case.
Alternatively, separate functions for each case might be generated.

Now, when it goes to concretizing this conception:
func (z Lenghted[T])getLength() L {
return z.a.getLength();
}

type LenghtedInt Lenghted[int];

It's clear that when a.getLength() is called, casts are important.
Also, as there are several possibilities, which one out of several
a.getLength-s will be called, switch must be created here, too.

Another example:

type AreFriends(A, B) concept {
};

Which allows us to declare that:
const AreFriends(MyInt, int) concept;

This was declared in beginning of this text.

We can go further:

type IsStringer(A) concept {
IsStringer(A);
};

As there are different namespaces – Prolog queries can't be used
directly in code – it's allowed to have identical names. Now, it's
possible to declare:

const IsStringer(MyClass) concept;

This simply declares a concept, using concrete class instead of
something abstract. This will result in error is IsStringer with those
parameters can not be applied. This will allow to constraint real
classes to be compatible with some generic constraint. Afterwards,
IsStringer(MyClass) will be true – if this line was not there,
IsStringer(MyClass) would be false, because it's IsStringer(A) and not
IsStringer[A] – it's not allow-all, but it's deny-all.

I think that there should be the following built-in:
const CanCast(OneClass, OtherClass) concept;

This would allow one to make it clear if her classes can be cast to
not – because casting is allowed in general, but for every single
instance, it should be disallowed. CanCast should result in
automatically generated caster. This should be two-way feature that if
you create a method for casting something and give it some specific
name, CanCast will result in true. And the latter case, to be general,
results in another use case:

type CanCast(A, B) concept {
};

type AutoCast[A, B] concept {
CanCast[A, B] || FunctionExists(cast(A) B);
};

This latter supposes method overloading, which is not there. Thus,
function's name should allow variables – in case the related regexps
don't go too complex and slow. I didn't want to complicate this thing
right now with this regexp example, but did better show something
working – if there is a cast function with input of type A and output
of type B, AutoCast will result in true. It will also be true if
CanCast is explicitly defined for some class.

Qtvali

unread,
Dec 20, 2009, 3:02:20 PM12/20/09
to golang-nuts
To make it complete, I would add a notion of "root class" to this
template engine.

Root class is a class, which contains all functions of all packages.
Syntax should allow to ensure, that root class applies to specific
interfaces (or that some packages do) and that those have functions
with specific interfaces. Static reflection, then.

type HaveAssignableLengths[A, B] [L] concept {
/.../
Inherit
// This allows the use of A.getLength()
?IsLength[A];
?HasLength[A];
?IsLength[B];
?HasLength[B];
}

If ?IsLength[A], which inherits methods from IsLength to class [A],
had different way to write it:
IncludeMethods(IsLength, [A]);

And:

type IsLength[B] [L] concept {
StructIsEqual([B], [L: int]) || StructIsEqual([B], [L: float])
||
(StructIsEqual([B], [N: Len]) && StructIsEqual([L], float));
getLength() L; // Demands this method to be declared in all
cases.
}

getLength() L; simply ensures that this method is defined for that
concept.

One would also allow the use:
EnsureMethod(getLength, Variables(), Variables(L));

Then this would be possible to ensure that some class or root class
contains all methods, which are made necessary by some specific
"Prolog program". This would allow to make it so that all classes from
root, which apply to interface x, must also contain few methods, which
are named by some convention.

The following, then, should be allowed:
* To ensure that a struct contains some public field
* To ensure that a struct does not contain some public field
* To ensure that only classes, which have all methods from some
interface, could have any of them
* To ensure that all classes, which have field with some naming
convention, also have getters and setters for that field
* To ensure that all classes, which have "Test" as their name suffix,
are compatible with "UnitTest" interface
etc.

Maybe more practical test would be that all classes, which are
compatible with "deepcopy" interface, have getters and setters
compatible with each other and that also subclasses of all of them
have them - this would need some syntax to actually call all those
methods, thus needing a "for" cycle inside template, which might not
be so simple idea; I don't know yet if this can be achieved, but
currently described syntax does allow checking that they have one
getter and setter with compatible struct in interface. For a list of
getters and setters, inside a call, reflection might be used. Anyway,
giving this prolog-like interface enough power of reflection, it would
be enough to implement any kinds of patterns.


Anyway, one kind of generics is yielding functions I described - those
are generics, which apply to code blocks so that you can make a
generic algorithm supporting some specific kind of block. A bit like
inline classes in Java etc. I think they would be still with use, also
they would be much more powerful if generics like these are applied -
then the structure of such yielding function would be more extensible.
Merging those ideas together, Go would have really powerful generic
support. (I would only ask for operator overloading and properties
then, to have also my classes/variables have all syntactic sugar
possible for primitive expressions in code)

Qtvali

unread,
Dec 20, 2009, 5:03:04 PM12/20/09
to golang-nuts
http://www.ifcomputer.co.jp/en/manuals5.2/uguide/node53.html#SECTION001100000000000000000
This is C interface to Prolog.

I also make it clear, how this kind of syntax converts into Prolog.

For concept declarations:


type Z[T] [L, l] concept {

IsStruct([T]);
HasMethod([T], lessThan() int);
lessThan() (int);
}

This should basically create the rule (I don't want to mess with more
complex structures here, show basic conversions, thus expressing only
one return value):
IsConcept(
allowed_z(T, L) :-
IsStruct(T);
HasMethod(T, less_than, int).
// Similar things are fed by parser
HasMethod(concept_z, less_than, int).

Now, an extension to this concept:
type StringerT[T] concept {
IsT([T]);
HasMethod([T], String, string);
?[T];
}

allowed_t(T) :-
allowed_z(T).
HasMethod(concept_stringer_t, X, Y) :-
HasMethod(concept_t, X, Y).

Now, one can register a structure R having this method:
IsStruct(struct_r);
HasMethod(struct_r, less_than, int).
HasMethod(struct_r, string, int).


And ask:
(by typing: type t StringerT[R];)
allowed_z(struct_r).
Getting: yes.
HasTemplateMethod(type_t, X, Y) :-
HasMethod(concept_stringer_t, X, Y).
HasMethod(type_t, X, Y) :-
HasMethod(struct_r, X, Y).

Now, to actually register those methods, compiler will ask:
HasTemplateMethod(type_t, X, Y);
And get:
X=lessThan, Y=int
X=String, Y=string

So that it can register those methods there.

Now, when compiler asks:
HasMethod(concept_z, X, Y);

It will get a list of all methods concept_z declares – with also all
input types if you define some rules there. Prolog can figure out the
“implementation details” - if it knows that one possible input is
subset of another possible input, it can give only the general case,
for example.

It can list all implementations, which are not subsets of another
implementation: one can say that choose the most generic like that:
generic(a, b). // A is more general than b
generic(b, c).
generic(c, d).
satisfies(c, 1).
satisfies(b, X) :- satisfies(c, X).
satisfies(n, 1).
// General rule for generics
generic(X, Y) :- generic(X, Z); generic(Z, Y).
// Anything is generic of itself
generic(X, X).
// X is result here, returns most general class
// satisfying given rule to be compiled with
// generic type.
topmost_satisfying(X, S) :-
satisfies(X, S);
not(parent_satisfies, R, S).
parent_satisfies(X, S) :-
generic(Z, X);
satisfies(Z, S).

Now, to list the topmost generics satisfying 1 (those, which have no
parent satisfying 1):
topmost_satisfying(X, 1)
X=b
X=n
No

Which is the list of possible input variations a generic class must
implement.

So, getting this prolog case as a beginning and making all syntax as
one-to-one mapping to prolog, specifically checking that all Prologs
functionality is included or does not make sense; in case it does not
make sense that it could be included without changing syntax rules,
one can be sure that the template library actually implements a turing-
complete version of all functionality programmer might need. C++
template library, it's told to me, is turing-complete.

Qtvali

unread,
Dec 21, 2009, 8:01:15 AM12/21/09
to golang-nuts
From another proposal -
http://groups.google.com/group/golang-nuts/browse_thread/thread/801e9412de89baab/3f94208706aad7a4
- I think that the good part is that native types should support some
interfaces to check if they implement an operator. I think that
instead of "LessThan", they should implement "op<". Also, tables and
arrays should implement things like "getIterator" and "getElement
(type)" etc. I think that this should be compatible with operator
overloading.

To moreover show what this Prolog-like try would give:

type IntegerTree[Node] [T] concept {
InterfaceApplies([Node], getElement, [T]);
InterfaceApplies([Node], getIter, chan [T]);
IntegerTree[T] || IsInt([T]);
}

This semantics would make it sure that IntegerTree has getElement,
which returns another IntegerTree or an integer. Thus, such a short
interface would be enough to check that there is a tree, which has
integers in it's innermost nodes - levels themselves might be of
different types. This allows an algorithm needing such tree to be
relatively safe. It also checks that iterator and element getter are
of same type without restricting the type itself and without needing
it to be separately specified, which type it is. This is roughly
similar to OCaml's way to declare variable types - but to use OCaml's
type definitions is somewhat hard, so it's reasonable to apply that
only to generics (in OCaml, everything is basically generic and that
makes programs actually more complicated to understand in my
experience).

Having [Node] and [T] separated there gives that when type
declarations need only [Node] to be specified, other concepts relying
on this concept can use also it's feature to figure out the actual
value of [T]. In Prolog, this [T] is simply replace with X or Y etc.
when it's not used by caller. It's easy to write a restriction that
[T] must give concrete type for any concept used in type definition
etc.

And, last but not least - as this is Prolog-compatible, it's not hard
to include it in Eclipse, for example, to give this automated error
correction behavior (like suggesting one to add few methods so that
constraint is satisfied).

atomly

unread,
Dec 21, 2009, 2:54:19 PM12/21/09
to Qtvali, golang-nuts
Didn't you start a blog?

--
:: atomly ::

[ ato...@atomly.com : www.atomly.com  : http://blog.atomly.com/ ...
[ atomiq records : new york city : +1.917.442.9450 ...
[ e-mail atomly-new...@atomly.com for atomly info and updates ...
Reply all
Reply to author
Forward
0 new messages