Status update: Assembly programmers needed


Adrian

Jun 7, 2010, 5:14:02 AM
to anic
So I've been crunching away at anic for several weeks now, and the
type flow engine is complete and largely stable. This means we're
moving to code gen, which is actually going to be very natural and
intuitive to evolve from the current code base, considering that the
way that the data flows mirrors pretty much exactly how the types flow
in ANI programs (and the type flow derivations are a solved problem).

ANI has evolved in the process of designing the type system: some of
the new features include object injection (which can be used for very
intuitive and robust multi-class inheritance, among other things),
structural object equivalence (with associated automatic structural
recasting), new data suffixes (there are now constants, latches,
lists, streams, arrays, and pools), and more. Operator overloading is
planned in the near future, as well. The good news: none of this
breaks any of the basic old ANI concepts as they are presented
currently in the tutorial, and the new concepts give the language a
whole lot of extra expressive power and adaptability. The bad news:
the tutorial needs to be massively expanded to handle these concepts,
and I guess since I'm the one that implemented these ideas, I'll have
to be the one to write up detailed walkthroughs for all of them
(unless someone is willing to come forward to help).

In any case, I will soon be defining the intermediate representation
trees and generating them from the type-annotated parse trees and
symbol tables that we have now. What would speed things up
tremendously is if someone were willing to work ahead and write
some code-dumping functions that take these basic intermediate
representation trees and dump out assembly text from them; experience
with assembly and compilers in general would be required here, as
things such as (reasonably intelligent) register allocators will need
to be implemented (or wired in from other GPL sources such as gcc). I
could of course do this all myself, but this is an opportunity for
parallelizing things a great deal -- once the intermediate
representation tree interface is defined, working on assembly dumps
should be able to proceed largely independently of the rest of the
compiler.

Our first planned target is x86 and/or x86-64, so someone willing to
implement this ASM dump translation layer for x86 would be a huge help
to ANI/anic right now. Once we have that and a one- or two-page
runtime dispatcher, we'll be able to run ANI code!

Thoughts? Volunteers?

Thanks for reading!
Adrian (Project lead)

Daniel Kersten

Jun 7, 2010, 9:04:21 AM
to anic
Has the new code been merged into the default branch or which one (default, map-binding, new-binding or new-typing) should I be looking at?

On 7 June 2010 10:14, Adrian <ult...@gmail.com> wrote:
> ANI has evolved in the process of designing the type system: some of
> the new features include object injection (which can be used for very
> intuitive and robust multi-class inheritance, among other things),
> structural object equivalence (with associated automatic structural
> recasting), new data suffixes (there are now constants, latches,
> lists, streams, arrays, and pools), and more. Operator overloading is
> planned in the near future, as well. The good news: none of this
> breaks any of the basic old ANI concepts as they are presented
> currently in the tutorial, and the new concepts give the language a
> whole lot of extra expressive power and adaptability. The bad news:
> the tutorial needs to be massively expanded to handle these concepts,
> and I guess since I'm the one that implemented these ideas, I'll have
> to be the one to write up detailed walkthroughs for all of them
> (unless someone is willing to come forward to help).

Sadly I can't help here - I can't document what I don't know.
 

> In any case, I will soon be defining the intermediate representation
> trees and generating them from the type-annotated parse trees and
> symbol tables that we have now. What would speed things up
> tremendously would be if someone was willing to work ahead and write
> some code dumping functions that takes these basic intermediate
> representation trees and dumps out assembly text from them; experience
> with assembly and compilers in general would be required here, as
> things such as (reasonably intelligent) register allocators will need
> to be implemented (or wired in from other GPL sources such as gcc). I
> could of course do this all myself, but this is an opportunity for
> parallelizing things a great deal -- once the intermediate
> representation tree interface is defined, working on assembly dumps
> should be able to proceed largely independently of the rest of the
> compiler.
>
> Our first planned target is x86 and/or x86-64, so someone willing to
> implement this ASM dump translation layer for x86 would be a huge help
> to ANI/anic right now. Once we have that and a one- or two- page
> runtime dispatcher, we'll be able to run ANI code!
>
> Thoughts? Volunteers?

Do you want the compiler to generate binaries? Assembly source files (nasm? masm? gas?), LLVM IR?

In any case, I'll have a look at the intermediate tree representation when you have it defined and give it a try.
 

--
Daniel Kersten.
Leveraging dynamic paradigms since the synergies of 1985.

Adrian

Jun 7, 2010, 12:06:17 PM
to anic
On Jun 7, 9:04 am, Daniel Kersten <dkers...@gmail.com> wrote:
> Has the new code been merged into the default branch or which one (default,
> map-binding, new-binding or new-typing) should I be looking at?

Everything's been appropriately merged into the default branch, and
all other current branches have officially been marked "closed" --
these don't have anything new or worth looking at anymore. This is a
good question, though, and it's good that it's being cleared up sooner
rather than later.

>
> On 7 June 2010 10:14, Adrian <ulti...@gmail.com> wrote:
> > ANI has evolved in the process of designing the type system: some of
> > the new features include object injection (which can be used for very
> > intuitive and robust multi-class inheritance, among other things),
> > structural object equivalence (with associated automatic structural
> > recasting), new data suffixes (there are now constants, latches,
> > lists, streams, arrays, and pools), and more. Operator overloading is
> > planned in the near future, as well. The good news: none of this
> > breaks any of the basic old ANI concepts as they are presented
> > currently in the tutorial, and the new concepts give the language a
> > whole lot of extra expressive power and adaptability. The bad news:
> > the tutorial needs to be massively expanded to handle these concepts,
> > and I guess since I'm the one that implemented these ideas, I'll have
> > to be the one to write up detailed walkthroughs for all of them
> > (unless someone is willing to come forward to help).
>
> Sadly I can't help here - I can't document what I don't know.

Yeah, I realize that unfortunately the burden's going to have to be
on me to do at least rough documentation for the new stuff.

>
> > In any case, I will soon be defining the intermediate representation
> > trees and generating them from the type-annotated parse trees and
> > symbol tables that we have now. What would speed things up
> > tremendously would be if someone was willing to work ahead and write
> > some code dumping functions that takes these basic intermediate
> > representation trees and dumps out assembly text from them; experience
> > with assembly and compilers in general would be required here, as
> > things such as (reasonably intelligent) register allocators will need
> > to be implemented (or wired in from other GPL sources such as gcc). I
> > could of course do this all myself, but this is an opportunity for
> > parallelizing things a great deal -- once the intermediate
> > representation tree interface is defined, working on assembly dumps
> > should be able to proceed largely independently of the rest of the
> > compiler.
>
> > Our first planned target is x86 and/or x86-64, so someone willing to
> > implement this ASM dump translation layer for x86 would be a huge help
> > to ANI/anic right now. Once we have that and a one- or two- page
> > runtime dispatcher, we'll be able to run ANI code!
>
> > Thoughts? Volunteers?
>
> Do you want the compiler to generate binaries? Assembly source files (nasm?
> masm? gas?), LLVM IR?

Assembly source text; that gives the greatest flexibility for linking
with other parts of ANI (the runtime, for example), while also being
the most easily human-debuggable (this will inevitably be an important
concern). I left the specific ASM target platform intentionally vague,
since it doesn't actually matter very much: converting between x86
dialects is a mostly mechanical transformation. Eventually, I'd
probably convert things down to interface with gas, however, since
that would give a uniform interface to nearly all machine
architectures that we'd be supporting. But for those that want a more
specific direction, shoot for x86 on gas with Intel syntax (it's in
general more sane than AT&T syntax).
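
For anyone who hasn't used gas in Intel mode, here is a minimal sketch of what "assembly source text for gas with Intel syntax" could look like when produced from C++. The function name dumpLabel and its parameters are hypothetical and not part of anic; the only gas-specific pieces are the standard .intel_syntax noprefix, .text, and .globl directives.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not anic code): renders one labelled block of
// already-selected instructions as gas source text in Intel syntax.
std::string dumpLabel(const std::string &label,
                      const std::vector<std::string> &instructions) {
    std::string out;
    // Switch gas to Intel operand order, with no '%' prefix on registers.
    out += ".intel_syntax noprefix\n";
    out += ".text\n";
    // Export the label so other objects (e.g. a runtime) can link to it.
    out += ".globl " + label + "\n";
    out += label + ":\n";
    for (const std::string &insn : instructions) {
        out += "    " + insn + "\n";
    }
    return out;
}
```

Calling dumpLabel("f", {"mov eax, 42", "ret"}) returns text that starts with the .intel_syntax noprefix line, followed by the section, the exported label, and the two indented instructions. Returning a string keeps the dumper independent of where the text finally gets written.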

>
> In any case, I'll have a look at the intermediate tree representation when
> you have it defined and give it a try.

Looking forward to it ;).

Adrian

Jun 11, 2010, 3:08:21 AM
to anic
Okay, so I've devised an intermediate code interface in genner.{h,cpp}.

The only kind of tree node that can't be fully implemented yet is the
SchedTree (to schedule a chunk of code to run), as that will require
interfacing to the runtime (which isn't written yet). Everything else
should be relatively easy to extract assembler dumps out of.

The best way to go about this assembly code dumping would probably be
implementing a register/memory spill allocator for TempTree (temporary
data) nodes and a tree tiling algorithm for optimally matching x86
instructions onto the intermediate representation tree -- this would
be the most extensible (to other targets) and the easiest to work with when
implementing code optimization, which any good compiler will need to
do at some point. This approach would also probably generate far
better code than a naive node-by-node translation of the intermediate
trees into static instructions; since one of the main goals of anic is
producing very fast binaries, this is quite an important concern.
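
As a rough illustration of the tiling idea, the sketch below runs a greedy top-down tiler ("maximal munch") over a toy expression IR. Everything in it is an assumption made for the example: the Node type, the helper constructors, the Muncher class, and the small tile set are not the genner.h tree nodes. The output uses virtual registers (t0, t1, ...) and would still need a register/spill allocation pass before it could be assembled.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy IR for illustration only; NOT the node types from genner.h.
struct Node {
    enum Kind { Const, Temp, Add, Mem } kind;
    long value = 0;                 // used by Const
    std::string name;               // used by Temp
    std::unique_ptr<Node> lhs, rhs; // Add uses both; Mem uses lhs as the address
};
using NodePtr = std::unique_ptr<Node>;

NodePtr make(Node::Kind k) { auto n = std::make_unique<Node>(); n->kind = k; return n; }
NodePtr constant(long v) { auto n = make(Node::Const); n->value = v; return n; }
NodePtr temp(std::string s) { auto n = make(Node::Temp); n->name = std::move(s); return n; }
NodePtr add(NodePtr l, NodePtr r) {
    auto n = make(Node::Add); n->lhs = std::move(l); n->rhs = std::move(r); return n;
}
NodePtr mem(NodePtr addr) { auto n = make(Node::Mem); n->lhs = std::move(addr); return n; }

class Muncher {
public:
    std::vector<std::string> code; // emitted instructions, in execution order

    // Tiles the tree rooted at n, trying the largest matching tile first,
    // and returns the (virtual) register that holds the result.
    std::string munch(const Node &n) {
        switch (n.kind) {
        case Node::Temp:
            return n.name; // already lives in a temporary; no instruction needed
        case Node::Const: {
            std::string r = fresh();
            code.push_back("mov " + r + ", " + std::to_string(n.value));
            return r;
        }
        case Node::Add: {
            std::string a = munch(*n.lhs);
            // Two-node tile: Add(e, Const) folds the constant into one lea.
            std::string b = (n.rhs->kind == Node::Const)
                                ? std::to_string(n.rhs->value)
                                : munch(*n.rhs);
            std::string r = fresh();
            code.push_back("lea " + r + ", [" + a + " + " + b + "]");
            return r;
        }
        case Node::Mem: {
            const Node &addr = *n.lhs;
            // Three-node tile: Mem(Add(e, Const)) becomes a single load
            // using base+displacement addressing.
            if (addr.kind == Node::Add && addr.rhs->kind == Node::Const) {
                std::string base = munch(*addr.lhs);
                std::string r = fresh();
                code.push_back("mov " + r + ", [" + base + " + " +
                               std::to_string(addr.rhs->value) + "]");
                return r;
            }
            std::string a = munch(addr);
            std::string r = fresh();
            code.push_back("mov " + r + ", [" + a + "]");
            return r;
        }
        }
        return ""; // unreachable for well-formed trees
    }

private:
    int next_ = 0;
    std::string fresh() { return "t" + std::to_string(next_++); }
};
```

With this tile set, munching mem(add(temp("p"), constant(8))) emits the single instruction "mov t0, [p + 8]", where a node-by-node translation would emit a separate constant load and add before the memory access. Greedy tiling like this is simple but not guaranteed optimal; an optimal tiling would need a cost-based (dynamic programming) matcher.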

The semantics for all of the intermediate representation tree nodes
should be obvious from their definitions and the comments in the .h,
but I'd be glad to clarify anything if need be. I'd be glad to do all
of the work of interfacing the assembly dumper into the rest of the
compiler; someone just needs to write some functions to take each
LabelTree and produce an x86 assembly source dump (preferably as a
string return value) of the label and the code that it represents.

Dan or anyone else willing to tackle this -- just let me know, and
I'll set up the framework (such as a repository branch) for you to
work in (and give you the perms if you don't yet have them).

Cheers!
Kajetan "Adrian" Biedrzycki

Daniel Kersten

Jun 11, 2010, 4:45:06 PM
to ult...@gmail.com, anic
Hi,

Yep, I'll be taking a look at this over the weekend and I'll let you know how I get on early next week.

My plan is to see how I would implement something like maximal munch pattern/instruction tiling with some crude register spilling (maybe just using the x86 stack for now), but I'll see after I've taken a look at the code, compiled it etc.

Dan.

Adrian

Jun 11, 2010, 11:33:25 PM
to anic
Sounds like a good approach to get off the ground with.

The planned memory allocation model for anic is going to need to
expect a large flat address space to stomp all over, but register
spills are a necessary evil and spilling to some specially reserved
thread-local "stack" is an approach as good as any other; I'll keep
this overhead memory reservation requirement in mind as I build the
intermediate trees.
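
To show how the reservation size could fall out of a crude allocator, here is a minimal sketch. The class name SpillAllocator, the base register, and the slot size are all hypothetical choices for the example, not anic's design. It does no liveness analysis: each temporary gets a register while any remain, and a fixed slot in the reserved spill area after that.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch (not anic code): assigns every temporary a permanent
// home, either a register or a slot in a reserved spill area.
class SpillAllocator {
public:
    // regs: registers available for temporaries, handed out in order.
    // spillBase: register assumed to point at the top of the spill area.
    // slotSize: bytes per spill slot.
    explicit SpillAllocator(std::vector<std::string> regs,
                            std::string spillBase = "rbp", int slotSize = 8)
        : regs_(std::move(regs)), spillBase_(std::move(spillBase)),
          slotSize_(slotSize) {}

    // Returns the operand text for a temporary: a register name, or a
    // memory operand such as "[rbp - 8]" once registers have run out.
    std::string locate(const std::string &temp) {
        auto it = home_.find(temp);
        if (it != home_.end()) return it->second; // already placed
        std::string where;
        if (nextReg_ < regs_.size()) {
            where = regs_[nextReg_++];
        } else {
            ++spillCount_;
            where = "[" + spillBase_ + " - " +
                    std::to_string(spillCount_ * slotSize_) + "]";
        }
        home_[temp] = where;
        return where;
    }

    // Total bytes of spill area this code needs reserved.
    int spillBytes() const { return spillCount_ * slotSize_; }

private:
    std::vector<std::string> regs_;
    std::string spillBase_;
    int slotSize_;
    std::size_t nextReg_ = 0;
    int spillCount_ = 0;
    std::map<std::string, std::string> home_;
};
```

With two registers available, the third and fourth temporaries land in "[rbp - 8]" and "[rbp - 16]", and spillBytes() reports 16. That number is the per-thread overhead the intermediate trees would have to account for; a real allocator would reuse registers and slots once a temporary is dead.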

Let me know if there's anything you need clarified (better to ask than
to assume the wrong thing); I'll try to get some better comments in
there to help as well.

Cheers!
Adrian
