Expression macros modifying global program and compiler state


Ķaмȋļ ๏ Şκaļşκȋ

Dec 20, 2008, 10:01:15 AM
to nemer...@googlegroups.com, nemer...@googlegroups.com
Hi!

We would like to hear about your experiences with expression macros
(i.e. not those declared in attributes, but those used inside methods'
bodies) which modify the program's hierarchy (e.g. add new methods or
classes through the compiler's Define API) or change the state of the
compiler in some other way.

Historically those were the hardest to support in the whole macro
system, because at the stage of typing a method's body (when expression
macros are expanded) many of the compiler's data structures are in a
more or less finalized state. Now we would like to hear from you whether
those macros are really useful and whether they could be replaced by a
combination of attribute-level macros and "pure" expression macros.

Other difficulties with those expression macros:
- Since the compiler sometimes runs in "error reporting mode", where it
types the same method a second time to gather more accurate errors,
macros can be expanded several times. This means they should cause their
side effects only once, or check whether what they need to do has
already been done (i.e. they should be idempotent). This can also be
achieved by guarding the affecting code with the manager's IsMainPass
property (a rough sketch of such a guard follows below), but developers
usually don't bother and can get strange error messages when a macro is
run several times.
- Folks working on Visual Studio IDE integration would like to run
typing of methods in parallel on several threads. However, many of the
compiler's data structures / APIs are not thread-safe and we don't
want to force developers to use locking before interacting with the
compiler (this is similar to IsMainPass, but much, much worse).
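
To make the idempotence point concrete, here is a rough sketch of such a
guard. Only IsMainPass, ImplicitCTX, Define and the decl quotation are the
real machinery; Manager, CurrentTypeBuilder and LookupMember are how I
remember the typer exposing things, and the field name is invented:

macro log (level, message)
{
  def typer = Nemerle.Macros.ImplicitCTX ();
  def tb    = typer.CurrentTypeBuilder;          // the class currently being typed

  // Touch the program only in the main pass, and only if an earlier
  // expansion has not already added the field - i.e. stay idempotent.
  when (typer.Manager.IsMainPass && tb.LookupMember ("_logger").IsEmpty)
    tb.Define (<[ decl: private mutable _logger : object; ]>);

  // The generated call itself is pure; a real version would use _logger here.
  <[ System.Console.WriteLine ("{0}: {1}", $level, $message) ]>
}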

Some ideological arguments:
- Adding fields / methods / classes as a side effect of placing some
macro in the depths of a method is rather impure and makes the whole
program harder to understand.
- Separation (at least mental) between the high-level definition of the
program's hierarchy, interfaces and structure on one side, and the
actual method bodies where the real work is done on the other, was
always our goal. With Nemerle's syntax we promote OO programming with
classes and interfaces, like in popular languages, but inside method
bodies we promote functional programming (local functions, type
inference, etc.). If the contents of the latter seriously affect the
contents of the former, it's a sign of bad things.

Nice uses of expression macros modifying the program:
- I have a little macro in my project which automatically adds a field
containing a Logger object to the current class if I use 'log(INFO, ...)'
anywhere in it. This saves me one line of code per class compared to the
class attribute macro which would do the same, e.g. [AddLogger] class Foo
{ ... } (a sketch of that attribute-level counterpart follows after this
list).
- Once there was a library for units of measure (see
http://nemerle.org/forum.old/viewtopic.php?t=240; note this is old and
won't work with the current compiler). It generated new kinds of units
(in the form of classes / interfaces) on the fly, e.g. if you divide
Length by Time in code, the result gets a new automatically generated
type, something like I_Length1_Time-1. This is nice, however IMHO this
implicit generation can and should be turned into explicit declaration
by macro attributes.
- Others? We would like to hear from you.
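
For comparison, the attribute-level counterpart of that log example would
be roughly this (a sketch only: AddLogger and the field name are invented,
while MacroUsage / MacroPhase / MacroTargets and TypeBuilder.Define are the
real class-macro machinery):

[Nemerle.MacroUsage (Nemerle.MacroPhase.BeforeTypedMembers,
                     Nemerle.MacroTargets.Class)]
macro AddLogger (tb : TypeBuilder)
{
  // Declarative: the field appears because the class asked for it,
  // not because some method body happened to contain log(...).
  tb.Define (<[ decl: private mutable _logger : object; ]>);
}

// usage - the one extra line per class mentioned above:
[AddLogger] class Foo { /* ... */ }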

In essence, if we do not find compelling examples, or all of them have a
clean and easy substitution with attribute macros, then we would like
to forbid mutating compiler state in expression macros. You would
still be able to do whatever you want with e.g. files or databases, but
the compiler and the program's structure would stay unchanged.

--
Kamil Skalski
http://kamil-skalski.pl

Igor Tkachev

Dec 21, 2008, 1:19:13 PM
to Ķaмȋļ ๏ Şκaļşκȋ, nemer...@googlegroups.com, nemer...@googlegroups.com
Hello Ķaмȋļ,

> - others? We would like to hear from you

1. Anonymous types if they are implemented as a macro.

2. Compile-time Duck Typing:

def duckObject = duck(IMyInterface, myObject [, myObject2...]);

It requires generating a new class that implements IMyInterface.

3. Caching by parameter values

def result = cache(System.Math.Exp, value);

This macro requires a private static field.
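
A rough sketch of the expansion I have in mind (all names below are
invented; it only shows why a private static field is unavoidable):

// field the macro would have to add to the enclosing class:
private static _expCache : System.Collections.Generic.Dictionary [double, double] =
  System.Collections.Generic.Dictionary.[double, double] ();

// what the call site would expand into:
def result =
  if (_expCache.ContainsKey (value))
    _expCache [value]
  else
  {
    def r = System.Math.Exp (value);
    _expCache [value] = r;
    r
  };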

--
Best regards,
Igor mailto:i...@rsdn.ru

Ķaмȋļ ๏ Şκaļşκȋ

Dec 21, 2008, 3:42:22 PM
to Igor Tkachev, nemer...@googlegroups.com, nemer...@googlegroups.com
2008/12/21 Igor Tkachev <i...@rsdn.ru>:

> Hello Ķaмȋļ,
>
>> - others? We would like to hear from you
>
> 1. Anonymous types if they are implemented as a macro.

Ok, we usually use tuples instead of anonymous types. But anonymous
types are usually more readable than tuples, so they can be useful. I
don't know how to do this without expression macros if you don't want
to declare those types explicitly... on the other hand, the Record macro
greatly simplifies writing small classes of that kind:

[Record] class NameValue {
  Name : string; Value : object;
}
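
(Roughly, [Record] just generates the constructor from the fields in
declaration order, so such a class stays a one-liner to create:)

def nv = NameValue ("Answer", 42);   // constructor generated by [Record];
                                     // the int boxes into the object-typed Value field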


>
> 2. Compile-time Duck Typing:
>
> def duckObject = duck(IMyInterface, myObject [, myObject2...]);
>
> It requires new class to implement IMyInterface.
>

Is it strongly typed, or based on reflection / dynamic invoke? Anyway, in
both cases I think you can easily go for something like:

[DuckyInterface(IMyInterface, TypeOfMyObject)]

---->

Duck(myObject : TypeOfMyObject) : IMyInterface {
  Duck_TypeOfMyObject_to_IMyInterface(myObject)
}

class Duck_TypeOfMyObject_to_IMyInterface : IMyInterface {
  Foo(x : int) : int {
    obj.Foo(x)
  }
  obj : TypeOfMyObject
}

The difference is that you need to specify the possible "duck conversions"
upfront, which might be kind of irritating if you are prototyping
something, but in stable, maintained code it is rather the better idea.
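
With that in place the call site stays as short as in your version (sketch):

def duckObject = Duck (myObject);     // statically typed as IMyInterface
ignore (duckObject.Foo (42));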

> 3. Caching by parameter values
>
> def result = cache(System.Math.Exp, value);
>
> This macro requires static private field.

I see. So your examples show that it is generally useful not to define
any additional fields / classes if you can avoid it, even if they would
be quite small. This is similar to my "log" macro: I would need to add a
[Logger] attribute to each class, while you would need to define
something like:

[HasCacheFor(System.Math.Exp, System...., )]

Thanks for input.

Ķaмȋļ ๏ Şκaļşκȋ

Dec 21, 2008, 3:56:02 PM
to nemer...@googlegroups.com, nemer...@googlegroups.com
2008/12/20 Ķaмȋļ ๏ Şκaļşκȋ <kamil....@gmail.com>:

We could consider some other approaches here:
- Require expression macros that modify the compiler's state to be
annotated with some special attribute, like [MutatesCompiler]. This
would cause the compiler to lock itself for the duration of this macro's
execution, to avoid concurrent mutation of the compiler by two macros.
If we detect at execution time that a macro tries to mutate the compiler
without having this attribute, we could print an error, though this
might not be easy to implement correctly.
- For the problem of error reporting mode, maybe the compiler should
automatically ignore any Define calls while it is in this mode?

Dmitry Ivankov

Dec 21, 2008, 5:42:05 PM
to nemer...@googlegroups.com
What is the main problem we want to resolve?
If it's just thread safety, then it can be done by making either a compiler proxy for macros or making the compiler thread-safe; maybe there are other ways, but all are simpler than new attributes and complex logic.
If it's purifying expression macros, then it could be a good idea to provide some private sandboxes for macros, where they can store their aux classes (like translation stuff, or the regexp objects of the regexp macro). We need to think about making it flexible and easy to use (maybe it's just synchronized access to a sandbox by its name; a rough sketch is below).
I think it's a good idea to force using top-level macros for modifying the non-sandbox hierarchy; I'm not quite sure that it's always sufficient though.
Making a threadable compiler won't be too simple, currently it's almost impossible:
- it shouldn't depend much on how threads are scheduled and run
- macros can interact for sure
- we can get some weird locking problems, like deadlocks caused by macros or something else
I think if we want it fast, then the only way is to make the compiler thread-safe with some locks and a big warning about possible unsafety. Unique name generation is the first thing that comes to mind as requiring locks.
If we want a pure parallel compiler with powerful macros inside, that's another story, which needs much structurization and formalization of macros and of course refactoring of the compiler itself.
So which is it: a threadable compiler to utilize many CPU cores (and solving the arising problems), purification and review of the macro system, working on error reporting mode, or maybe parallel compilation of methods only in special cases?
All are worthy, but picking the main direction is necessary.
The first two are very interesting; the second even shouldn't be too big a task (in terms of programming, not of thinking, of course).
The third just has to be done, that is, collect bugs and wishes and resolve them.
The fourth should be more specific: what exactly is the main reason to run parallel method typing at the moment?
As for impure expression macros, I think we should collect the possible kinds of them and put them into snippets or something like that in svn, to have them in one place :)
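
A very rough sketch of what I mean by a sandbox (all names are invented;
it's just a synchronized per-name bag the compiler could hand to macros
instead of the real program hierarchy):

using System.Collections.Generic;

public module MacroSandbox
{
  private syncRoot : object = object ();
  private store    : Dictionary [string, object] = Dictionary.[string, object] ();

  // Synchronized access by name, so two macros (or two threads typing
  // methods in parallel) never step on each other's aux objects.
  public GetOrAdd (name : string, create : void -> object) : object
  {
    mutable result : object = null;
    lock (syncRoot)
    {
      when (!store.ContainsKey (name))
        store [name] = create ();
      result = store [name];
    }
    result
  }
}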

Ķaмȋļ ๏ Şκaļşκȋ

Dec 21, 2008, 6:22:04 PM
to nemer...@googlegroups.com
2008/12/21 Dmitry Ivankov <divan...@gmail.com>:

Ok, I agree. Making the compiler thread-safe would be hard and I would
rather encourage anybody trying to do this to first rewrite most of
the compiler's APIs and object hierarchy into something better. Possibly
after that the thread-safety problem could be manageable.

> If its purifying expression macros then could be a good idea to provide some
> private sandboxes for macros, where they can store their aux classes (like
> translation stuff, regexp objects of regexp macro), need to think on making
> it flexible and easy to use (maybe it's just synchronized access to sandbox
> by it's name).

Yeah, this would need to be merged into the program in a synchronized way
later on, and it would prevent you from actually seeing the program
immediately after your additions... It would also change the way you
interact with the compiler and leave the old code unsafe.

> I think it's a good idea to force using top level macros for modifying
> nonsandbox hierarchy, I'm not quite sure that it's always sufficent though.
> Making threadable compiler won't be too simple, currently it's almost
> impossible
> - it shouldn't depend much on how threads are sheduled and run
> - macros can interact for sure
> - we can get some weird locking problems like deadlock by macros or smth
> else
> I think if we want it fast then the only way is to make compiler threadsafe
> by some locks and a big warning about possible unsafety. Unique names
> generaton is the first thing that comes in mind to require locks.

Hmm, we could take the methods used to mutate the compiler (shouldn't be
much more than TypeBuilder.Define and GlobalEnv.Define) and make them
thread-safe / lock when they know we are running in the method typing
stage (sketch below). But anyway, if we want the compiler to run multiple
threads there are many places to fix, like the name generation you
mentioned, expanding the external type stubs, etc. Vlad was asking me
about those places, but I guess the only way to enumerate them is to dig
deeply into the code and/or try running in parallel and learn the hard way.
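
Something along these lines, just to show the shape of it (SafeDefine and
defineLock are invented; TypeBuilder.Define is the real entry point, and a
real fix would of course live inside the compiler rather than in a wrapper):

using Nemerle.Compiler;
using Nemerle.Compiler.Parsetree;

public module SafeDefine
{
  private defineLock : object = object ();

  // Funnel all program mutations coming from expression macros through one
  // lock, so methods typed in parallel cannot corrupt the hierarchy.
  public Define (tb : TypeBuilder, member : ClassMember) : void
  {
    lock (defineLock)
    {
      tb.Define (member)
    }
  }
}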

> If we want pure parallel compiler with powerful macros inside it's another
> story, which needs much structurization and formalization for macros and of
> course refactoring of the compiler itself.
> So which is it, threadable compiler to utilize many cpu cores and solving
> arising problems, purification and review of macrosystem, working on error
> reporting mode or maybe a parallel compile of methods in part of/in special
> cases?
> All are worthy but the main direction is necessary.

If you ask me, I would like to see purification and review of the
macro system (or more widely, the compiler's API) as the first thing. But
since I don't really have time to do it myself, this question is for
Vlad, who is nowadays pushing development forward.

> First two are very interesting, second even shouldn't be too big task (in
> terms of programming, not thinking of course).
> Third just have to be done, that is collect bugs, wishes and resolve them.
> Forth should be more specific, what exactly is the main reason to run
> parallel method typing at the moment?
> As for impure expression macros i think we should collect possible kinds of
> them and put to snippets or smth like that in svn to have them in one place
> :)

The trigger for this thread was Vlad's idea to type methods in
parallel when in IDE mode. This has a related problem: the IDE runs
typing of the same method several times, just like error reporting mode -
the same IsMainPass property can be used to detect it and ensure "some"
idempotence in the macro.

Other problems are rather something to live with, at least for me, like:
- mutating the compiler / program in expression macros has many more
limitations and most probably much more dangerous bugs (even causing
strange code to be generated) than doing it during the attribute macro
passes (at least based on the history of existing bugs)
- requiring stuff like IsMainPass is kind of ugly, though we should
probably explicitly encourage / request macro writers to make their
macros safe for multiple runs
- the slightly non-declarative style caused by those macros... from the
examples we can see some better and some worse, but it seems there are
some nice uses.

VladD2

Dec 22, 2008, 4:31:06 AM
to nemer...@googlegroups.com
2008/12/22 Ķaмȋļ ๏ Şκaļşκȋ <kamil....@gmail.com>:

> Ok, I agree. Making compiler thread-safe would be hard and I would
> rather encourage anybody trying to do this to first rewrite most of
> the compiler's APIs and object hierarchy to something better. Possibly
> after that thread-safety problem could be manageable.

+1

> If you ask me, I would like to see purification and review of
> macrosystem (or more widely compiler's API) as the first thing. But
> since I don't really have time to do it myself, this question is to
> Vlad, who is nowdays pushing development forward.

I agree. The first step should be refactoring of the compiler, and
not only the API but also the internal code.

> The trigger for this thread was Vlad's idea to type methods in
> parallel when in IDE mode.

No, no, no... Not only IDE mode, but compiler mode too. The compiler
project is about 1.2 MB of source, but its compile time is about 2
minutes (on a 2.5 GHz Core 2 Duo).

> This has also related problem, that IDE
> runs typing of the same method several times, just like error
> reporting mode - the same property IsMainPass can be used to detect it
> and ensure "some" idempotence to the macro.

+1

>
> Other problems are rather something to live with, at least for me, like:
> - mutating compiler / program in expression macros has much more
> limitations and most probably much more dangerous (even causing some
> strange code to be generated) bugs than doing it during attribute
> macros pass (at least based on the history of existing bugs)
> - requiring stuff like IsMainPass is kind of ugly, though we should
> probably explicitly encourage / request macro writers to make their
> macros safe for multiple runs
> - slightly non-declarative style caused by those macros... from the
> examples we can see some better or worse, but seems like there are
> some nice uses.

+1

Igor Tkachev

Dec 22, 2008, 10:10:59 AM
to nemer...@googlegroups.com, nemer...@googlegroups.com
Hello Kamil,

I think there is another, bigger problem. Almost any more or less
complicated macro requires type information. But the way to obtain it -
compile a PExpr subtree -> get a TExpr -> get the type - seems to have
been added later, after PExpr and TExpr were designed. I guess that
design considered only PExpr to be involved in macros, to develop simple
things such as if/else. But life is life.

What we are doing right now is closing some gaps. Instead, we should
redesign the macro system to meet user expectations, which go far beyond
just an if/else macro.

> We could consider some other approaches here:
> - require expression macros modifying compiler's state to be annotated
> with some special attribute, like [MutatesCompiler]. This would cause
> complier to lock itself for the time of this macro's execution to
> avoid concurrent mutation of compiler by two macros. If we detect at
> compiler's execution time that you try to mutate it, but don't have
> this attribute defined, then we could print some error, though this
> might not be easy to implement correctly.
> - for the problem of error reporting mode maybe compiler should
> automatically ignore any Define calls if it is in this mode?

--

Ķaмȋļ ๏ Şκaļşκȋ

Dec 22, 2008, 10:40:13 AM
to nemer...@googlegroups.com, nemer...@googlegroups.com
2008/12/22 Igor Tkachev <igor.t...@gmail.com>:

>
> Hello Kamil,
>
> I think there is another bigger problem. Almost any more or less
> complicated macro requires type information. But the way to obtain it,
> compiling PExpr subtree -> get TExpr -> get type, seems to be added
> later after PExpr and TExpr were designed. I guess that design
> considered only PExpr to be involved in macro to develop simple things
> such as if/else. But life is life.

This is exactly how it happened. Dealing with TExpr in macros was
always hard, or rather it was of the same difficulty as doing so
inside the compiler / typing engine, which is not what the average user
should be forced to do. Essentially I don't know if this can be improved,
other than maybe focusing on some concrete examples and trying to create
methods in the compiler to simplify them.

For example, in http://nemerle.org/svn/nemerle/trunk/macros/core.n, in
the 'foreach' macro there is a single call to typer.TypeExpr(PExpr) and
then we just match on the MType (a simplified sketch of that match is
below)... which is maybe not the simplest data structure, but it couldn't
really be much simpler. There are also uses of SuperType and DelayMacro.
The second one might be confusing, but it makes the macro work better
with our type inference algorithm.
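
Simplified from memory (not a verbatim quote of core.n; the MType option
names are roughly what I remember, TypeExpr and Type.Hint are real):

def tcollection = typer.TypeExpr (collection);
def description =
  match (tcollection.Type.Hint)
  {
    | Some (MType.Array (_, _))  => "an array - iterate it with a plain index"
    | Some (MType.Class (ti, _)) => $"a class $(ti.FullName) - go through GetEnumerator()"
    | Some (_)                   => "something else"
    | None                       => "type not known yet - this is where DelayMacro kicks in"
  };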

>
> What we are doing right now is closing some gaps. But instead we
> should redesign the macro system to meet user expectations which are
> far bigger than just if/else macro.
>

Probably defining those expectations (and filtering out the *really*
impossible ones) would help, especially for any refactoring of the compiler.

>> We could consider some other approaches here:
>> - require expression macros modifying compiler's state to be annotated
>> with some special attribute, like [MutatesCompiler]. This would cause
>> complier to lock itself for the time of this macro's execution to
>> avoid concurrent mutation of compiler by two macros. If we detect at
>> compiler's execution time that you try to mutate it, but don't have
>> this attribute defined, then we could print some error, though this
>> might not be easy to implement correctly.
>> - for the problem of error reporting mode maybe compiler should
>> automatically ignore any Define calls if it is in this mode?
>
> --
> Best regards,
> Igor
> mailto:i...@rsdn.ru
>
>
> >
>

--
Kamil Skalski
http://kamil-skalski.pl

Igor Tkachev

Dec 22, 2008, 11:44:16 AM
to nemer...@googlegroups.com, nemer...@googlegroups.com
Hello Kamil,

>> I think there is another bigger problem. Almost any more or less
>> complicated macro requires type information. But the way to obtain it,
>> compiling PExpr subtree -> get TExpr -> get type, seems to be added
>> later after PExpr and TExpr were designed. I guess that design
>> considered only PExpr to be involved in macro to develop simple things
>> such as if/else. But life is life.

> This is exactly how it happened. Dealing with TExpr in macros was
> always hard, or rather it was of the same difficulty as doing so
> inside compiler / typing engine, which is not what average user should
> be forced to do. But essentially I don't know if this can be improved,
> maybe if we focus on some concrete examples and just try to create
> methods in compiler to simplify them.

I see only one thing a macro needs from TExpr, which is type
information. Probably moving typing from TExpr to PExpr could solve
most of the problems and make the API more consistent.

> For example http://nemerle.org/svn/nemerle/trunk/macros/core.n in
> 'foreach' macro there is single call to typer.TypeExpr(PExpr) and
> then we just match on the MType... which is maybe not the simplest
> datastructure, but it rather couldn't be simpler. Also there are uses
> of SuperType and DelayMacro. Second one might be confusing, but it
> makes macro work better with our type inference algorithm.

It does, but the name DelayMacro is still confusing and looks foreign.

>> What we are doing right now is closing some gaps. But instead we
>> should redesign the macro system to meet user expectations which are
>> far bigger than just if/else macro.
>>

> Probably defining those expectations (and filtering out those *really*
> impossible) would help, especially in any refactoring of compiler.

1. Micro DSLs such as the linq extension and built-in xml. We have
problems with both right now.

2. Type level macro:

table Person
{
  field ID        : int primary key;
  field FirstName : string(50);
  field LastName  : string(50);
  ...
}

lexer UriParser
{
  Uri         ::= [ absoluteURI | relativeURI ] [ "#" fragment ];
  absoluteURI ::= scheme ":" ( hier_part | opaque_part );
  relativeURI ::= ( net_path | abs_path | rel_path ) [ "?" query ];
  ...
  alpha ::= AlphaTest;

  AlphaTest(c : char) : bool
  {
    char.IsLetter(c)
  }
}

Even 'macro' could be a macro :)

3. Support for inheritance:

[ImplementCustomTypeDescriptor]
public abstract class EntiryBase
{
}

Here I expect implementation of ICustomTypeDescriptor for every
child of the EntityBase class.

4. And so on (let's use our imagination).

>>> We could consider some other approaches here:
>>> - require expression macros modifying compiler's state to be annotated
>>> with some special attribute, like [MutatesCompiler]. This would cause
>>> complier to lock itself for the time of this macro's execution to
>>> avoid concurrent mutation of compiler by two macros. If we detect at
>>> compiler's execution time that you try to mutate it, but don't have
>>> this attribute defined, then we could print some error, though this
>>> might not be easy to implement correctly.
>>> - for the problem of error reporting mode maybe compiler should
>>> automatically ignore any Define calls if it is in this mode?

We can. But again, it is just closing a gap.

My point is that we should stop making this kind of change, release
version 1.0, turn on our imagination, and start working on the next
version. Of course we will have to refactor the compiler first to make
it highly maintainable and prepared for changes.

--
Best regards,
Igor
mailto:igor.t...@gmail.com

Ķaмȋļ ๏ Şκaļşκȋ

Dec 22, 2008, 12:51:30 PM
to nemer...@googlegroups.com, nemer...@googlegroups.com
2008/12/22 Igor Tkachev <igor.t...@gmail.com>:

>
> Hello Kamil,
>
>>> I think there is another bigger problem. Almost any more or less
>>> complicated macro requires type information. But the way to obtain it,
>>> compiling PExpr subtree -> get TExpr -> get type, seems to be added
>>> later after PExpr and TExpr were designed. I guess that design
>>> considered only PExpr to be involved in macro to develop simple things
>>> such as if/else. But life is life.
>
>> This is exactly how it happened. Dealing with TExpr in macros was
>> always hard, or rather it was of the same difficulty as doing so
>> inside compiler / typing engine, which is not what average user should
>> be forced to do. But essentially I don't know if this can be improved,
>> maybe if we focus on some concrete examples and just try to create
>> methods in compiler to simplify them.
>
> I see only one thing needed from TExpr by macro which is type

def ty = Nemerle.Macros.ImplicitCTX().TypeExpr(expr).Type;

> information. Probably moving typing from TExpr to PExpr could solve
> most of the problems and make API more consistent.

Do you mean getting rid of TExpr and leaving only PExpr? This would
probably pull some other structures into being merged with their
Parsetree equivalents, add a huge number of new variant options not
related to any syntax construct, and bring *lots* of unrelated code into
the place where we currently have just parse tree information.
Maybe that's good; it would be cleaner in some sense to have a single
variant for the program tree across the whole compilation.

>
>> For example http://nemerle.org/svn/nemerle/trunk/macros/core.n in
>> 'foreach' macro there is single call to typer.TypeExpr(PExpr) and
>> then we just match on the MType... which is maybe not the simplest
>> datastructure, but it rather couldn't be simpler. Also there are uses
>> of SuperType and DelayMacro. Second one might be confusing, but it
>> makes macro work better with our type inference algorithm.
>
> It does, but the name DelayMacro is still confusing and looks foreign.
>

Looks like you cannot have both: powerful type inference and powerful
macros with an easy API to check types. At least not with our current
inference algorithm, which is based on delaying the computation of a
type until the moment it becomes possible.
DelayMacro is similar to an asynchronous method call, where instead of
getting the result and doing some work immediately, you need to provide
a callback to finish later.

However, this particular use of DelayMacro could be avoided by making the
macro require a *typed* collection as a parameter:

macro foreach (x, collection : TExpr, body)

The compiler could automatically delay execution of the macro until it
knows the type of the collection... though it means mixing PExpr and
TExpr in arguments, so it would be a mess to implement.

>>> What we are doing right now is closing some gaps. But instead we
>>> should redesign the macro system to meet user expectations which are
>>> far bigger than just if/else macro.
>>>
>
>> Probably defining those expectations (and filtering out those *really*
>> impossible) would help, especially in any refactoring of compiler.
>
> 1. Micro DSLs such as linq extension and build-in xml. We have
> problems with both right now
>
> 2. Type level macro:
>
> table Person
> {
> field ID : int primary key;
> field FirstName : string(50);
> field LastName : string(50);
> ...
> }
>
> lexer UriParser
> {
> Uri ::= [ absoluteURI | relativeURI ] [ "#" fragment ];
> absoluteURI ::= scheme ":" ( hier_part | opaque_part );
> relativeURI ::= ( net_path | abs_path | rel_path ) [ "?" query ];
> ...
> alpha ::= AlphaTest;
>
> AlphaTest(c : char) : bool
> {
> char.IsLetter(c)
> }
> }
>

You could do most of this with the current macros and syntax extensions.
You just need to have 'class' somewhere, like

table class Person {
  field ID        : int primary key;
  field FirstName : string(50);
  field LastName  : string(50);
  ...
}

> Even 'macro' could be a macro :)
>

Yeah, I think it could be written as a macro with the current system.
What 'macro' does is just generate a class with a specific interface and
attributes attached.

> 3. Support for inheritance:
>
> [ImplementCustomTypeDescriptor]
> public abstract class EntiryBase
> {
> }
>
> Here I expect implementation of ICustomTypeDescriptor for every
> child of the EntityBase class.
>

Right, this has been in the plans for a very long time, but never
got implemented. It is even possible to declare that a macro should be
inherited, but it is not actually inherited.

> 4. And so on (Lets use our imagination).
>
>>>> We could consider some other approaches here:
>>>> - require expression macros modifying compiler's state to be annotated
>>>> with some special attribute, like [MutatesCompiler]. This would cause
>>>> complier to lock itself for the time of this macro's execution to
>>>> avoid concurrent mutation of compiler by two macros. If we detect at
>>>> compiler's execution time that you try to mutate it, but don't have
>>>> this attribute defined, then we could print some error, though this
>>>> might not be easy to implement correctly.
>>>> - for the problem of error reporting mode maybe compiler should
>>>> automatically ignore any Define calls if it is in this mode?
>
> We can. But again, it is just closing a gap.
>
> My point is we should stop making this kind of changes, release
> version 1.0, turn on our imagination, and start working on the next
> version. Of cause we will have to refactor the compiler first to make
> it highly maintainable and prepared for changes.
>

Incremental fixes work much better in practice than plans for a full
redesign and rewrite. I'm fully for the latter, but both designing
and implementing it are huge tasks and I'm not sure how, and by whom,
this can be done.

Dmitry Ivankov

Dec 22, 2008, 1:14:14 PM
to nemer...@googlegroups.com
Btw, I'm going to add a new kind of tree - EmitTree. It's necessary to make some transformations on it before using the emit API, to get valid programs (debug nops and skipped unreachable yield labels currently cause trouble).
It's not directly related to the topic, but if PExpr's and TExpr's nature is in fact very different, then this is one more example of a different tree :)

Igor Tkachev

Dec 22, 2008, 1:47:03 PM
to Dmitry Ivankov, nemer...@googlegroups.com
Hello Dmitry,

> Btw, I'm going to add new kind of a tree - EmitTree, it's necessary
> to make some transformations on it before using emit API to get
> valid programs (debug nops, and skipped unreachable yield labels currently cause troubles).
> It's not related to the topic directly, but if PExpr and TExpr
> nature is very different in fact then it's one more example of a different tree

It should be TExpr. PExpr should be used for manipulation by macros and
other high-level transformations. The purpose of TExpr should be
optimizations and IL/whatever generation. No additional tree is needed.

--
Best regards,
Igor
mailto:i...@rsdn.ru

Dmitry Ivankov

Dec 22, 2008, 2:21:35 PM
to nemer...@googlegroups.com
No, I think it's too low-level to put into TExpr; it's not likely that we'll perform optimizations there, and correct IL is not a thing we should think about at the early stages. Also, I'm thinking of killing TExpr.Goto and TExpr.Label - I'm almost sure they are not needed at the early stages and are quite unsafe (that waits for reimplementing yield).
So it has to be a separate private tree, mostly to work around the emit API's unfriendliness. Moreover, this stuff can be performed only on almost raw IL; I don't see why we should mix IL and higher constructs, they'll just be in each other's way.

Ķaмȋļ ๏ Şκaļşκȋ

Dec 22, 2008, 2:24:31 PM
to nemer...@googlegroups.com
2008/12/22 Dmitry Ivankov <divan...@gmail.com>:

Hm, could you reconsider / check your idea once again? We once had
CExpr or something like it and it was a real pain:
- it duplicated most of the TExpr nodes, which was a maintenance nightmare
- rewriting TExpr to CExpr was a performance penalty, like 5-10%
- it wasn't really necessary, it just had the same information as TExpr

Maybe you can just add an option or two to TExpr to handle some kinds
of blocks specially?

Igor Tkachev

Dec 22, 2008, 2:26:52 PM
to nemer...@googlegroups.com, nemer...@googlegroups.com
Hello Kaмil,

>> I see only one thing needed from TExpr by macro which is type

> def ty = Nemerle.Macros.ImplicitCTX().TypeExpr(expr).Type;

Yeah, welcome to the beyond Nemerle :)

BTW, why is ImplicitCTX static? As a macro is a class, it could be a
member of this class.

>> information. Probably moving typing from TExpr to PExpr could solve
>> most of the problems and make API more consistent.

> Do you mean getting rid of TExpr and leaving only PExpr?

No, I do not mean it. TExpr is needed for low-level optimizations and
IL-generation.

> Looks like you cannot have both: powerful type inference and powerful
> macros having easy API to check types.

We have it right now. All we need is to make it more consistent and
developer friendly.

> However this particular use of DelayMacro could be solved by making
> macro require *typed* collection as parameter:
> macro foreach (x, collection : TExpr, body)

Nice idea. We could use restrictions and hints in a more intelligent
way. What if the compiler somehow knew that the expected collection type
is an array, IEnumerable, or IEnumerable[T]?

> compiler could automatically delay execution of macro until it knows
> the type of collection... though it means mixing PExpr and TExpr in
> arguments, so it would be a mess to implement.

That is why I said - typing should be moved to PExpr.

> Incremental fixes works much better in practice than plans for full
> redesign and rewrite. I'm fully for the latter, but both: designing
> and implementing are huge tasks and I'm not sure how and who can do
> this.

Incremental fixes are definitely better in the short run. However,

incremental fix
incremental fix
incremental fix
incremental fix
...

gets worse and worse over time, which is what is happening to the compiler.

There is another way:

incremental fix
incremental fix
refactoring
new feature
bug fix
refactoring
redesign
incremental fix
refactoring
...

I've worked this way for years and it allows me to keep my code looking
like it was designed and implemented just yesterday :)

There is only one problem - backward compatibility. The language
itself will stay compatible, the compiler API probably will not. That is
why I am talking about releasing the first version.

Dmitry Ivankov

Dec 22, 2008, 2:43:26 PM
to nemer...@googlegroups.com
It doesn't duplicate TExpr, it rather wraps ILEmitter member calls into tree nodes.
The pattern is | Opcode | Call | Throw | Tryblock | Label | SequencePoint | DeclLocal | Sequence (CExpr, CExpr).
I just rewrote the calls to ILGenerator as tree construction, then transform the tree and walk it with the real generator.
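
In Nemerle terms I read that pattern roughly as (field types here are my
guesses, not the actual code):

variant CExpr
{
  | Opcode        { op : System.Reflection.Emit.OpCode }
  | Call          { method : System.Reflection.MethodInfo }
  | Throw
  | Tryblock      { body : CExpr; handler : CExpr }
  | Label         { label : System.Reflection.Emit.Label }
  | SequencePoint { line : int; column : int }
  | DeclLocal     { ty : System.Type }
  | Sequence      { first : CExpr; second : CExpr }
}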
 

> - rewriting TExpr to CExpr was a performance penalty, like 5-10%

Haven't noticed a slowdown, but I can check this once more.
In fact the current transformation could be done over a stream of calls, but that is harder to implement and support; it would be a much smarter wrapper around the emitter, but without a new tree.

> - it wasn't really necessary, it just had the same information as TExpr

What was it like?

Ķaмȋļ ๏ Şκaļşκȋ

Dec 22, 2008, 2:45:21 PM
to nemer...@googlegroups.com, nemer...@googlegroups.com
2008/12/22 Igor Tkachev <igor.t...@gmail.com>:

>
> Hello Kaмil,
>
>>> I see only one thing needed from TExpr by macro which is type
>
>> def ty = Nemerle.Macros.ImplicitCTX().TypeExpr(expr).Type;
>
> Yeah, welcome to the beyond Nemerle :)
>
> BTW, why ImplicitCTX is static? As a macro is a class, it could be a
> member of this class.
>

That is a nuisance: ImplicitCTX is actually a macro which expands to
the hidden parameter of the macro's Run method. It could be changed to a
member field of the macro and we could access the typer through a
property of the macro's class, but I just thought that passing the typer
as a parameter would be faster (and now, with thread-safety being
mentioned, it is the proper way).

>>> information. Probably moving typing from TExpr to PExpr could solve
>>> most of the problems and make API more consistent.
>
>> Do you mean getting rid of TExpr and leaving only PExpr?
>
> No, I do not mean it. TExpr is needed for low-level optimizations and
> IL-generation.
>
>> Looks like you cannot have both: powerful type inference and powerful
>> macros having easy API to check types.
>
> We have it right now. All we need is make it more consistent and
> developer friendly.
>
>> However this particular use of DelayMacro could be solved by making
>> macro require *typed* collection as parameter:
>> macro foreach (x, collection : TExpr, body)
>
> Nice idea. We could use restrictions and hints in more intelligent
> way. What if compiler get known somehow that expectation for
> collection type is array, IEnumerable, or IEnumerable[T]?
>

Hm, right. This could be used by the compiler; it's a matter of adding
several Upper or Lower bounds on the TyVar.

>> compiler could automatically delay execution of macro until it knows
>> the type of collection... though it means mixing PExpr and TExpr in
>> arguments, so it would be a mess to implement.
>
> That is why I said - typing should be moved to PExpr.
>

Sorry, but this sounds a bit crazy to me. Did you look at
http://nemerle.org/svn/nemerle/trunk/ncc/typing/TypedTree.n and the whole
ncc/typing directory? TExpr is much richer than PExpr and it is used
deeply throughout the multiple typing stages. Moving typing to PExpr
is probably effectively equivalent to getting rid of TExpr and merging
all the data structures into Parsetree.

PExpr <---> a close representation of the syntax written by the user,
with support for quotation generation and analysis
TExpr <---> the program's representation, full of semantic information,
differentiation based on the kind of method / field (static, generic,
etc.) and execution planning

As I said before, merging them has some advantages, makes the tree
uniform, etc., but it also has many disadvantages.

--
Kamil Skalski
http://kamil-skalski.pl

Ķaмȋļ ๏ Şκaļşκȋ

Dec 22, 2008, 2:55:53 PM
to nemer...@googlegroups.com

Then it is something different from what we had. What do you do with
this tree after it is constructed that is not possible to do
directly on TExpr?

>>
>> - rewriting TExpr to CExpr was a performance penalty, like 5-10%
>
> Haven't noticed slowdown, but can check this once more.
> In fact current transformation can be done over a stream of calls, but is
> harder to implement and support, it'll be much smarter wrap around emitter,
> but without new tree.
>
>>
>> - it wasn't really necessary, it just had the same information as TExpr
>
> What was it like?

See diff for r5165 (using svn client, it's not available on web)

Igor Tkachev

Dec 22, 2008, 3:07:59 PM
to Dmitry Ivankov, nemer...@googlegroups.com
Hello Dmitry,

> It doesn't duplicate TExpr, it rather wraps ILEmitter members calls into tree nodes.
> Pattern is | Opcode | Call | Throw | Tryblock | Label |
> SequencePoint | DeclLocal | Sequence (CExpr, CExpr)
> I just rewritten calls to ILgenerator with tree construction, then
> transform tree and walk it with real generator.

If you need a wrapper over IL, you can use an idea implemented in
BLToolkit - http://www.bltoolkit.net/Doc/Reflection/Emit/HelloWorld.htm

In this case your code will look like native MSIL.

Dmitry Ivankov

Dec 22, 2008, 3:11:01 PM
to nemer...@googlegroups.com
I run reachability analysis and skip unreachable code there (there is also a variant | SkipCookie (reported_expr)).
I also insert a "nop" or "br to itself" in some cases - ILEmitter declares a label just after a try block, but there may be no expression there if it was skipped, so no label and bad IL.
One more funny thing is that "if (true) A else B" didn't cause B to be skipped, but with my stuff it does :))
 


>>
>> - rewriting TExpr to CExpr was a performance penalty, like 5-10%
>
> Haven't noticed slowdown, but can check this once more.
It's ~5 sec slower (4 sec user + 1 sec sys) overall (3m real; 2m42s real, 1.3s sys without the patch).
Just one run of make clean && time make boot.
 

> In fact current transformation can be done over a stream of calls, but is
> harder to implement and support, it'll be much smarter wrap around emitter,
> but without new tree.
>
>>
>> - it wasn't really necessary, it just had the same information as TExpr
>
>  What was it like?

> See diff for r5165 (using svn client, it's not available on web)

Ok, I'll have a look, it's really somewhat different at first sight :)

Dmitry Ivankov

Dec 22, 2008, 3:19:09 PM
to nemer...@googlegroups.com
It's not what I need; maybe creating these shortcuts and pipeline stuff would look nice, but it's not the main thing.
Will that wrapper help here? The IL code is like

some_method {
  throw Exception();
  nop;
}

which is not verifiable and requires patching :(

Ķaмȋļ ๏ Şκaļşκȋ

Dec 22, 2008, 3:31:11 PM
to nemer...@googlegroups.com

Usually we do this kind of analysis in
http://nemerle.org/svn/nemerle/trunk/ncc/generation/Typer4.n
and save some markers as

[System.Flags]
enum TExprFlags
{
  | IsAssigned              = 0x0001
  | Visited                 = 0x0002

  | Throws                  = 0x0004
  | ThrowsComputed          = 0x0008

  | NeedAddress             = 0x0010

  | Addressable             = 0x0020
  | AddressableComputed     = 0x0040

  | JumpTarget              = 0x0080

  | Constrained             = 0x0100

  | GenerateTail            = 0x0200

  | SkipWriteCheck          = 0x0400

  | NeedsEmptyStack         = 0x0800
  | NeedsEmptyStackComputed = 0x1000
}


Maybe you could do the same here? Like an "Unreachable" flag (from your
description this is the only thing you do, plus a way to handle the
unreachable code to make PEVerify happy)?

Dmitry Ivankov

Dec 22, 2008, 3:41:58 PM
to nemer...@googlegroups.com
It would require adding TExpr.Nop and expanding TExpr.DebugInfo to it, and I'm now not sure that this analysis can be done correctly over TExpr. The problem is that it would need to be updated whenever the TExpr -> IL transform changes.
For now it's true that only reachability analysis is sufficient.
I'd rather give steroids to NemerleGenerator (one more weird thing :)) ) than merge IL with TExpr, as I want a robust solution :)

Dmitry Ivankov

Dec 22, 2008, 3:55:08 PM
to nemer...@googlegroups.com
It's possible in fact, but we would have to ensure that this flag is inherited by the generated IL, which isn't true for DebugInfo. I think I can postpone this stuff until yield is rewritten and DebugInfo is refactored a bit, and then look at the problem once more.
And I still think that goto can be avoided and replaced with something else to simplify the analysis.

Igor Tkachev

Dec 22, 2008, 5:31:10 PM
to nemer...@googlegroups.com
Hello Dmitry,

> It's possible in fact, but we should ensure that this flag will be
> inherited by the IL generated, which isn't true for DebugInfo, I
> think a can postpone this stuff until yield is rewritten and
> DebugInfo is refactored a bit and then look at the problem once more.
> And I still think that goto can be avoided and replaced with smth to simplify analysis.

That is what I am talking about. We need one tree for high-level
manipulations and another one for low-level optimizations. Today we
have two trees, both for the middle level :)

Ķaмȋļ ๏ Şκaļşκȋ

Dec 23, 2008, 4:44:54 AM
to nemer...@googlegroups.com
2008/12/22 Igor Tkachev <igor.t...@gmail.com>:
PExpr is for the frontend or "skin" level; it does not exist after the
very first moment the typer is started (and later the typer has several
more passes). TExpr is used entirely as the semantic, internal tree of
the program - it is an intermediate representation
(http://en.wikipedia.org/wiki/Intermediate_representation) between
syntax and IL.
I can agree that sometimes it is useful to have another IR between the
typed tree and asm, so that optimizations can be done more easily, but so
far we have been successful just distinguishing several nodes like Goto,
Label, etc. (and marking them in comments as not used before some stage)
and using them in addition to keeping the whole tree as it was.

>
> --
> Best regards,
> Igor
> mailto:igor.t...@gmail.com
>
>
> >
>



Ķaмȋļ ๏ Şκaļşκȋ

Dec 24, 2008, 5:56:44 PM
to nemer...@googlegroups.com
2008/12/22 Igor Tkachev <igor.t...@gmail.com>:

>
> Hello Kamil,
>
>>> I think there is another bigger problem. Almost any more or less
>>> complicated macro requires type information. But the way to obtain it,
>>> compiling PExpr subtree -> get TExpr -> get type, seems to be added
>>> later after PExpr and TExpr were designed. I guess that design
>>> considered only PExpr to be involved in macro to develop simple things
>>> such as if/else. But life is life.
>
>> This is exactly how it happened. Dealing with TExpr in macros was
>> always hard, or rather it was of the same difficulty as doing so
>> inside compiler / typing engine, which is not what average user should
>> be forced to do. But essentially I don't know if this can be improved,
>> maybe if we focus on some concrete examples and just try to create
>> methods in compiler to simplify them.
>
> I see only one thing needed from TExpr by macro which is type
> information. Probably moving typing from TExpr to PExpr could solve
> most of the problems and make API more consistent.
>
>> For example http://nemerle.org/svn/nemerle/trunk/macros/core.n in
>> 'foreach' macro there is single call to typer.TypeExpr(PExpr) and
>> then we just match on the MType... which is maybe not the simplest
>> datastructure, but it rather couldn't be simpler. Also there are uses
>> of SuperType and DelayMacro. Second one might be confusing, but it
>> makes macro work better with our type inference algorithm.
>
> It does, but the name DelayMacro is still confusing and looks foreign.
>

Maybe we should have an alternative TypeExpr method, which would take
a lambda and call it only after the type is fully available. So instead of

def tExpr = typer.TypeExpr(expr);

def result = match (tExpr.Type.Hint)
             {
               | Some(ty) => $"known as $(ty)"
               | None     => "unknown"
             }

which today has to be wrapped in DelayMacro when the type is not known yet:

def result = typer.DelayMacro(fun (_fail_loudly)
             {
               def tExpr = typer.TypeExpr(expr);
               match (tExpr.Type.Hint)
               {
                 | Some(ty) =>
                   // do something with the type
                   Some(<[ () ]>)
                 | None => ...
               }
             });

we would treat TypeExpr as an async call, like it's done in distributed
programming:

typer.TypeExprOrDefer(pexpr, fun (ty, _fail_loudly) {
  // here goes the whole logic you wanted to write after
  // def tExpr = typer.TypeExpr(pexpr), with the type already known:
  def result = $"known as $(ty)";
  ....
});

Possibly this construct could be hidden by some macro, though I doubt
it would make things more understandable.

--
Kamil Skalski
http://kamil-skalski.pl

Dmitry Ivankov

Dec 24, 2008, 9:23:29 PM
to nemer...@googlegroups.com
Hmm, TypeExpr isn't enough, the macro wants a Hint too.
I'd like to try the following:
- kill DelayMacro
- if a macro needs type information it does the following:
-- generates a synthetic list[OverloadPossibility]
-- returns a TExpr.Delayed with kind OverloadedMacro,
which is resolved by the usual overload resolution and then by calling TypeExpr (resolvedmacro.Run())
So it's moving part of the macro logic into the overload engine, which restricts the macro a bit, but on the other hand the macro can implicitly give a type hint itself :))
Very experimental, but it can give something interesting; I'll drop the results to the list when I try it :)
At first sight the lock macro is "_ : class", and foreach is "_ : array[_]", "_ : list[_]", "_ : IEnumerable" and "_ : class /* and has GetEnumerator() */", and there are no more uses of DelayMacro in ncc itself.