> Re: Object System
>
> Observation: this means that there will be three completely different
> syntactic ways to define name mappings:
>
> define foo { } // global name
> \x [ ... ] // local name
> ... "foo" _def_ // member name
These are all part of different dialects of Cat:
- define is part of "Little Cat", or the core cat language.
- \x is part of "Labmda Cat", the core language a lambda expressions
- "foo" _def_ is part of object Cat, the core language with OOP support
> But on to the interesting part, the interaction between types,
> compile-time and run-time. If I understand correctly, meta_string and
> _xyz_ behave somewhat like symbols and syntactical forms in Scheme?
Yes.
> On the other hand the meta_string is a string, not something else, so
> it behaves somewhat like a literal in C++.
Yes. Actually this is closer to the reality.
> Now in C++ it is possible to perform calculations on "literal int",
> i.e. the "literal" behaves like a qualifier that, unfortunately, is
> not explicitly available in the language, one that remains until a
> calculation becomes "tainted" by a non-literal int.
This is a rather informal approach to explaining it.
The reality of C++ is almost the same as in Cat: a literal int is not
a value, but rather a meta-value, something that exists alongside
types.
Using a kind system literals would have their own syntactic category
which was similar to types, and separate from terms in the language.
> If I throw in the idea of performing all purely pure computations at
> compile time, why do you need meta_string?
Because it makes explicit the difference between the two. The type
system has to do that one way or another. If you create ad-hoc rules
in the language definition ala C++, you are doing what I am doing with
Cat, but simply in an informal way.
> If, at the moment when the
> _def_ is handled by the compiler, all preceeding pure expressions
> have already been evaluated, then it is sufficient to check that
> there is a regular string, and use that...
What do you mean by regular string? You seem to be assuming that all
pure expressions will be pre-evaluated when resolving "_def_". This
might not always be the case. Using meta-values allows you to use the
type-system to assure that the precise meta-string type is known (and
will never change).
> Another exampe where the pure/impure distinction plus the
> immutability of all data gives rise to a cool simplification of
> things ;-)
I'm sorry but I don't yet see the simplification.
> Does this work as I imagine?
I don't know yet. However I suspect we are talking about the same
thing more or less. I am just proposing a type system solution to
issues that are traditionally resolved in compilers using ad-hoc
means. From a programmer's prespective the difference would be almost
impercetible. However in all likelihood using the type system, would
probably yield more consistent and predictable language behaviour.
Christopher
[...]
> > If, at the moment when the
> > _def_ is handled by the compiler, all preceeding pure expressions
> > have already been evaluated, then it is sufficient to check that
> > there is a regular string, and use that...
>
> What do you mean by regular string? You seem to be assuming that all
> pure expressions will be pre-evaluated when resolving "_def_". This
> might not always be the case. Using meta-values allows you to use the
> type-system to assure that the precise meta-string type is known (and
> will never change).
Yes, I assume that everything pure is pre-evaluated, so that when the
"syntactic form" _def_ is compiled, it can check whether its argument
is a normal string object.
> > Does this work as I imagine?
>
> I don't know yet. However I suspect we are talking about the same
> thing more or less. I am just proposing a type system solution to
> issues that are traditionally resolved in compilers using ad-hoc
> means. From a programmer's prespective the difference would be almost
> impercetible. However in all likelihood using the type system, would
> probably yield more consistent and predictable language behaviour.
To me it seems that the type distinction is not necessary for the
literals; every pure expression that evaluates, at compile time, to a
string, can be used instead. I do not want to change the remainder of
the handling, i.e. the _def_ "form" actually creates a new, extended
type.
While it might be desirable to forbid the freedom that my approach
brings for reasons of code clarity, I don't (yet) see what exactly you
gain by giving the literals a special status...
(One of the simplifications I was referring to is that the whole
constant subexpression elimination stuff, that is probably rather
involved in dynamically typed scripting languages, comes out nearly
for free once you know that you can evaluate all pure functions on
known values at compile time; my "reflex" is to try to apply this
principle to as many situations as possible, thereby eliminating
special cases.)
Ciao, Colin
That assumption isn't always feasible. Pure computations can be
intractable or undecidable. As a result, you probably don't want to
perform all pre-evaluation at compile-time.
Also can you pre-evaluate all-expressions before determining all
types? You might run into a circular impossibility with mutually
recursive definitions.
> > > Does this work as I imagine?
> >
> > I don't know yet. However I suspect we are talking about the same
> > thing more or less. I am just proposing a type system solution to
> > issues that are traditionally resolved in compilers using ad-hoc
> > means. From a programmer's prespective the difference would be almost
> > impercetible. However in all likelihood using the type system, would
> > probably yield more consistent and predictable language behaviour.
>
> To me it seems that the type distinction is not necessary for the
> literals; every pure expression that evaluates, at compile time, to a
> string, can be used instead. I do not want to change the remainder of
> the handling, i.e. the _def_ "form" actually creates a new, extended
> type.
>
> While it might be desirable to forbid the freedom that my approach
> brings for reasons of code clarity, I don't (yet) see what exactly you
> gain by giving the literals a special status...
We are still looking at two sides of the same coin. What I am
proposing is essentially a formalization of the notion of
pre-evaluation. I am sorry that I failed to be more explicit but
certain value expressions (like +, -, *, etc.) can be trivially
extended to meta-values. This means that "12 13 +" can have the type
"25".
What I can do with the type system approach is prevent pre-evaluation
from continuing on forever, by limiting the number of operations
available to meta-values. You want to try to minimize the possibility
of infinite loops in your type-checker, and minimize compile-times.
> (One of the simplifications I was referring to is that the whole
> constant subexpression elimination stuff, that is probably rather
> involved in dynamically typed scripting languages, comes out nearly
> for free once you know that you can evaluate all pure functions on
> known values at compile time; my "reflex" is to try to apply this
> principle to as many situations as possible, thereby eliminating
> special cases.)
We are trading one thing for another here: pre-evaluation stage for
type checking. I am proposing is embedding the pre-evaluation stage
into the type-checker directly.
I appreciate your insights into the issues, and your willingenss to
talk this through Colin. It helps me out a lot.
- Christopher
Unless your language your language is strongly normalizing, in which
case pure computation is definitely bounded. If you can prove
termination, via sized types, strongly normalizing terms or other
technique, then it's not an issue.
Sandro
> > Yes, I assume that everything pure is pre-evaluated, so that when the
> > "syntactic form" _def_ is compiled, it can check whether its argument
> > is a normal string object.
>
> That assumption isn't always feasible. Pure computations can be
> intractable or undecidable. As a result, you probably don't want to
> perform all pre-evaluation at compile-time.
Hm.
> Also can you pre-evaluate all-expressions before determining all
> types? You might run into a circular impossibility with mutually
> recursive definitions.
Hm.
I'm not far enough with my implementation/thinking to have stumbled
over this one ... while I wouldn't let myself be deterred by infinite
loops and similar (better to have the compiler hang rather than the
program during execution), I do see that mutual recursive types could
be a bother. If/when I get that far I'll simply try it out, after all,
the driving force for my suggestion was to simplify the
implementation, so it should be really simple [to try it out] to see
what happens. At least it will be very instructive for me ;-)
> What I can do with the type system approach is prevent pre-evaluation
> from continuing on forever, by limiting the number of operations
> available to meta-values. You want to try to minimize the possibility
> of infinite loops in your type-checker, and minimize compile-times.
Does your approach prevent infinite loops, or merely defer them?
(Have to dig here ... must admit that I can't immediately grok all
implications of what I am suggesting.)
> I appreciate your insights into the issues, and your willingenss to
> talk this through Colin. It helps me out a lot.
And thank you for your patience, these exchanges with you are helping
me sustain a good learning curve that would be much flatter
otherwise...
Colin
A lot of people say this, but the truth is, how could you possibly
know the compiler has hung and isn't doing actual work? A terminating
type checker is undeniably a good thing.
Sandro
Full agreement here.
I should have made clear that I was referring to infinite loops in the
actual computation, not on the type level, e.g. when your code
contains something equivalent to (but not as obvious as) "while (1);"
then I would actually prefer the compiler to hang, rather than the
compiled program (which might be a lonely server that suddenly stops
responding).
Regards, Colin
I'm unfamiliar with the term "strongly normalizing". Are you saying
that a strongly normalizing language always halts? In which case it
must not be Turing complete. I am staying within the bounds of Turing
complete languages (though the type system itself doesn't have to be
Turing-complete).
> If you can prove
> termination, via sized types, strongly normalizing terms or other
> technique, then it's not an issue.
Languages in which this is always possible are particular limited.
However you can use such techniques to identify pure expressions which
can be evaluated at compile-time. In which case you will have
introduced a new set of semantic rules for such values. They are no
longer simply "pure expressions" as Colin was suggesting but
"terminating pure expressions". This would indicate then that a new
syntactic category would be needed to express this formally. Then you
are essentially left with a meta-value.
Traditionally, in C++, the formalization was eschewed in favour of a
set of rules in the the specification expressed in natural language. I
am just trying to add some rigour.
So allow me to reiterate what I am saying: meta-values are a syntactic
category for the set of values that can be computed at compile-time.
> Sandro
Thanks for jumping in Sandro, it is good to get multiple viewpoints on
the subject.
Christopher
Some programs (e.g. web servers) are not expected to terminate under
natural conditions. If I understand you correctly you are suggesting
they shouldn't compile neither?
I don't mind type-checkers that are Turing complete (e.g. dependent
type systems), and can get caught in infinite loops in rare
circumstances, but this should be due to some mistake in the type
system, instead of default behaviour in the face of a perfectly valid
non-halting program.
- Christopher
It depends on the operations made available to meta-values. I would
probably allow them ironically enough, but only by allowing explicit
recursion or iteration over meta-values (meta-loops). This is similar
to the C++ template system, where you can force the compiler into an
infinite loop, if you try very hard. What I don't want is infinite
loops at the program level to always yield infinite loops in the type
checker.
- Christopher
Yes. Looping or recursion are not expressible in such languages,
except by 'self passing', which is to say, a procedure accepts a
reference to itself which it calls to recurse.
> In which case it
> must not be Turing complete. I am staying within the bounds of Turing
> complete languages (though the type system itself doesn't have to be
> Turing-complete).
That's exactly it. The only addition to a strongly normalizing
language to permit general recursion is some kind of fixpoint/loop
construct, which passes a caller to itself ('tying the knot') or
merely loops on some predicate.
As you mention in a subsequent e-mail, many programs like servers and
operating systems do not necessarily terminate, but all such programs
can be reduced to a strongly normalizing program which is only
perpetuated by the above loop term based on a predicate.
If you detect the loop construct in a pure expression, then you don't
evaluate it. If you don't detect it, then you can evaluate it safely
at compile-time knowing it will terminate. I'm not sure that this
would work for Cat given its design, but it's a thought. :-)
> > If you can prove
> > termination, via sized types, strongly normalizing terms or other
> > technique, then it's not an issue.
>
> Languages in which this is always possible are particular limited.
We're dealing here only with compile-time evaluation though, where the
degree of evaluation is already limited. I was merely suggesting that
a possible solution is restricting the terms to the set which are
known to terminate, essentially a strongly normalizing core language,
and only evaluating that at compile-time.
> However you can use such techniques to identify pure expressions which
> can be evaluated at compile-time. In which case you will have
> introduced a new set of semantic rules for such values. They are no
> longer simply "pure expressions" as Colin was suggesting but
> "terminating pure expressions". This would indicate then that a new
> syntactic category would be needed to express this formally. Then you
> are essentially left with a meta-value.
>
> Traditionally, in C++, the formalization was eschewed in favour of a
> set of rules in the the specification expressed in natural language. I
> am just trying to add some rigour.
>
> So allow me to reiterate what I am saying: meta-values are a syntactic
> category for the set of values that can be computed at compile-time.
I haven't followed this discussion closely, so I'm not too familiar
with your proposal. Just thought I'd toss in my 1 cent on the
termination argument.
Sandro
If the infinite loop is pure it needn't compile, like the example
"while (1);".
(Any server would (or should) have some impure functions in the
infinite loop that would prevent their complete evaluation at compile
time, after all, system calls, including all socket operations, are
impure.)
> I don't mind type-checkers that are Turing complete (e.g. dependent
> type systems), and can get caught in infinite loops in rare
> circumstances, but this should be due to some mistake in the type
> system, instead of default behaviour in the face of a perfectly valid
> non-halting program.
Perhaps the compiler could then have the courtesy of saying something
like "type inference halted after 10^10 iterations, do you want to
continue?" ;-)
Colin
Good point. Ignoring my flawed example, I still stand by my belief
that full evaluation of all pure expressions at compile-time is not a
practical idea.
> > I don't mind type-checkers that are Turing complete (e.g. dependent
> > type systems), and can get caught in infinite loops in rare
> > circumstances, but this should be due to some mistake in the type
> > system, instead of default behaviour in the face of a perfectly valid
> > non-halting program.
>
> Perhaps the compiler could then have the courtesy of saying something
> like "type inference halted after 10^10 iterations, do you want to
> continue?" ;-)
For sure that would be great! But if that happens too frequently
during compilation you may run into some unhappy customers.
Cheers,
Christopher
I had, on and off, been thinking about another mechanism that would
allow retro-fitting a name to a stack location, i.e. anywhere in the
"istream" you can say "from now on, the value found at index 10 on the
stack is to be known under the name 'foo'"; any later occurrence of
foo would simply duplicate the value on the top of the stack.
OTOH I was wondering about not using top-level for instructions, i.e.
a source consists of defines, and once everything is defined I (or the
compiler) only call "main". In such a style I can use lambdas to give
names to all parameters, effectively giving everything a name...
The problem with the Env object on top of the stack, and the required
rewriting, from the "Monad" post, made me think of this again.
Ciao, Colin
I am not sure whether you mean "10th item from the top of the stack",
or "10th item from the bottom". In the former, you would lose
referential transparency. Not very desirable. In the latter you be
extending the type to have a set of at least 10 arguments, which as
far as I can tell is fine, just a bit strenuous for the brain, but
feasible.
> OTOH I was wondering about not using top-level for instructions, i.e.
> a source consists of defines, and once everything is defined I (or the
> compiler) only call "main". In such a style I can use lambdas to give
> names to all parameters, effectively giving everything a name...
This sounds perfectly reasonable to me.
> The problem with the Env object on top of the stack, and the required
> rewriting, from the "Monad" post, made me think of this again.
So far I've pretty much decided to add global variables to Cat. Any
explicit usage of them will be consider "effectful" thus the arrow
will switch from "->" to "~>". However the implications of the "Monad"
post (which turns out not to be a monad after all) is that "~>" means
not that there is a side-effect, but instead the type has simply
undergone a monadic-like transformation.
So:
E.x + : (int ~> int)
will really mean:
E.x + : (int E -> int E)
Any thoughts?
- Christopher
>From the top of the stack, and only "locally", i.e. the name is only
valid as long as the compiler can prove that the stack position that
it was initially "bound" to has not changed; i.e. using it must be
equivalent into a sequence of swap/bury/rotate etc. operations
including one dup.
>From the bottom would effectively introduce global variables (again),
I'd do that in a different way [by making the current stack accessible
to "define" in some way]; i.e. global variables would be always be
constant (bound to the same objects, objects are const anyhow).
> > OTOH I was wondering about not using top-level for instructions, i.e.
> > a source consists of defines, and once everything is defined I (or the
> > compiler) only call "main". In such a style I can use lambdas to give
> > names to all parameters, effectively giving everything a name...
>
> This sounds perfectly reasonable to me.
>
> > The problem with the Env object on top of the stack, and the required
> > rewriting, from the "Monad" post, made me think of this again.
>
> So far I've pretty much decided to add global variables to Cat. Any
> explicit usage of them will be consider "effectful" thus the arrow
> will switch from "->" to "~>". However the implications of the "Monad"
> post (which turns out not to be a monad after all) is that "~>" means
> not that there is a side-effect, but instead the type has simply
> undergone a monadic-like transformation.
So you want global variables to be mutable? (I.e. they can be
redefined to point to different objects at run-time?)
> So:
>
> E.x + : (int ~> int)
>
> will really mean:
>
> E.x + : (int E -> int E)
>
> Any thoughts?
Not yet ... I'm rather new to functional programming, and I've never
used Haskell, and I haven't really understood what monads are; I've
tried by reading some articles, but what I've found are mainly
examples on how to use them. I would like a concise definition of what
a monad actually _is_.
What I also haven't quite understood yet is why, in the example of the
impure access to environment variables, you want to "fight" the
tainting and somehow "hide" it in a monad (or similar). If your
canoncial instruction stream does an impure read, so be it...
The analogy of an instruction stream is fun, when you think of an out-
of-order CPU you also can re-order some instructions, but not all
etc...
Ciao, Colin