Porting guide (engine in Python)

11 views
Skip to first unread message

Dmitry Ponyatov

unread,
Oct 31, 2018, 1:34:45 AM10/31/18
to XL programming language and runtime
What papers/tutorials I must consult first if I want to write porting guide for XL engine? (besides source code)

I want to make the core idea to be visible more clear, not covered by C++ complexity, by implementing the minimal language rewrite core in Python.

Christophe de Dinechin

unread,
Oct 31, 2018, 6:33:55 AM10/31/18
to xlr-...@googlegroups.com


> On 31 Oct 2018, at 06:34, Dmitry Ponyatov <dpon...@gmail.com> wrote:
>
> What papers/tutorials I must consult first if I want to write porting guide for XL engine? (besides source code)
>
> I want to make the core idea to be visible more clear, not covered by C++ complexity, by implementing the minimal language rewrite core in Python.

Ha!

Some (somewhat outdated) docs are here: http://xlr.sourceforge.net.

You may want to start with http://xlr.sourceforge.net/Concept%20Programming%20Presentation.pdf.

As for the language itself, https://github.com/c3d/xl/blob/master/xlr/doc/XLRef.pdf.

All this is in bad need of love ;-)

>
> https://github.com/ponyatov/xl/tree/py
>
> --
> You received this message because you are subscribed to the Google Groups "XL programming language and runtime" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to xlr-talk+u...@googlegroups.com.
> To post to this group, send email to xlr-...@googlegroups.com.
> Visit this group at https://groups.google.com/group/xlr-talk.
> For more options, visit https://groups.google.com/d/optout.

Message has been deleted

Dmitry Ponyatov

unread,
Nov 1, 2018, 3:22:48 AM11/1/18
to XL programming language and runtime
Is it critical for rewrite engine to use only B-tree (not more than 2 branches in a node) representation?

Things like

(infix <CR>
 
(name "Hello")
 
(infix <CR>
 
(name "World")
................)))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))

makes my mind boiling, and does not support attribute grammar in nodes, which may be extra useful for programming language processing.

In my knowledge base experiments, I use such a like base class:

## root class
class Node:
   
   
## construct leaf with value
   
## @param[in] V value
   
def __init__(self,V):
       
## primitive value
       
self.value = V
        ## attributes (attribute grammar elements, or object slots) # std::map<std::string,Node*>
        
self.attr = {}
       
## nested elements = vector & stack # std::vector<Node*>
       
self.nest = []

able to hold 
  • any count of nested elements in order (array, vector, and stack representation at the same time) and 
  • string-keyed attributes (generic associative array, can be used for attribute grammar).
Does the use of this structure in XLTree class (tree.h) will totally break the rewriting algorithm?
I mean not code fixes for this STL-based enhancements, but the core method of rewrite doing. Maybe B-tree only is a critical requirement.

For example, int main() becomes a clear human-readable structure:
<function:main>
  args =
<block:args>
    <arg:argc>
  type = <type:int>
    <arg:argv>
  type =
<type:array>
    <pointer:*>
      <type:char>
  return =
<type:int>
  <function:init>
  <name:hello>
    <name:World>
  <function:fini>
  <name:return>
    <integer:0>

Dmitry Ponyatov

unread,
Nov 1, 2018, 3:25:31 AM11/1/18
to XL programming language and runtime

Christophe de Dinechin

unread,
Nov 1, 2018, 5:34:34 AM11/1/18
to xlr-...@googlegroups.com


> On 1 Nov 2018, at 08:22, Dmitry Ponyatov <dpon...@gmail.com> wrote:
>
> Is it critical for rewrite engine to use only B-tree (not more than 2 branches in a node) representation?

For source code, yes. It’s a matter of simplicity. But only for source code.

>
> Things like
>
> (infix <CR>
> (name "Hello")
> (infix <CR>
> (name "World")
> ................)))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))
>
> makes my mind boiling,

It’s only a matter of rendering. It’s nothing more than a linked-list, and like for any linked list, this means accessing the N-th element is N hops. It makes sense to structure it that way because during rewrites, you *never* access the N-th element, you *always* access the surrounding ones.

> and does not support attribute grammar in nodes, which may be extra useful for programming language processing.

There is an “info” field in Tree just for that purpose. You can stick whatever you need there. It won’t show during rendering (and neither should it). It’s designed to be accessed by type, e.g. you could have `tree->GetInfo<MyStuff>()` to get whatever you put in there. The reasons for that design is that the compiler is extensible, so any compiler plugin can efficiently attach their own data to any tree node in whatever format they want.

Only the *source* is binary, though, and there’s a reason for that, see my comments below.

>
> In my knowledge base experiments, I use such a like base class:
>
> ## root class
> class Node:
>
> ## construct leaf with value
> ## @param[in] V value
> def __init__(self,V):
> ## primitive value
> self.value = V
> ## attributes (attribute grammar elements, or object slots) # std::map<std::string,Node*>
> self.attr = {}
> ## nested elements = vector & stack # std::vector<Node*>
> self.nest = []
>
> able to hold
> • any count of nested elements in order (array, vector, and stack representation at the same time) and
> • string-keyed attributes (generic associative array, can be used for attribute grammar).
> Does the use of this structure in XLTree class (tree.h) will totally break the rewriting algorithm?

It would certainly require a very different algorithm. The existing algorithm requires the ability to match one tree type to another.

I have tried this kind of approach before. I actually started that way, see http://mozart-dev.sourceforge.net. In that initial implementation, an “if” would be represented by an “If” node, a “function” by a “function” node, and so on. I’m afraid the source history was lost when SourceForge decommissioned their CVS service.

This approach meant any tiny source transformation plugin had to know about all possible source node type, and could not add their own in a way that others would easily understand. So it proved to be not extensible, and not general enough. Figuring out I could do the same thing with a much simpler tree was a key insight, and happened only after years of tinkering.

I retried limited N-ary recently, see https://github.com/c3d/xl-c-rewrite. In that version, tree has a “handler”, instead of 8 node types, so you can have an infinite number of tree classes. Some of them like “array” (in array.[ch]) are N-ary. I hoped it would be manageable, and help for things like a sequence of line. It turns out out it is not manageable, and I quickly ran into the same issues I had encountered earlier with Mozart, i.e. any tree rewrite must have deep-knowledge of all other tree rewrites, which is exactly what I managed to avoid with what you call B-Tree.

So for now, I’d stick to my current design, i.e.: source is a simple tree, with the ability to add arbitrary meta-data that is not part of the source.


> I mean not code fixes for this STL-based enhancements, but the core method of rewrite doing. Maybe B-tree only is a critical requirement.

As far as I can tell, it is. Not really “B-tree”, more like “a very small and closed set of tree node types”. This is what made Lisp meta-programming possible. Lisp has essentially two node types. XL has 8, so it’s “worse”, but the reason for that is to match the node types to how humans read and write code.

>
> For example, int main() becomes a clear human-readable structure:
> <function:main>
> args = <block:args>
> <arg:argc>
> type = <type:int>
> <arg:argv>
> type = <type:array>
> <pointer:*>
> <type:char>
> return = <type:int>
> <function:init>
> <name:hello>
> <name:World>
> <function:fini>
> <name:return>
> <integer:0>

I positively *love* your definition of “human-readable” ;-)

In that case, how does an arbitrary plugin know what to do with your “return” field, for example?

Try to imagine this. I write the “parse_tree” function which simply copies its input, but evaluates {{X}} by inserting X from the evaluation context into the node. Then you could write code that unrolss N times the same code as:

unroll 1, Body is Body
unroll N, Body is
parse_tree
{{Body}}
{{unroll N-1, Body}}

Now,

unroll 4, write “Hello”

will generate a code that looks like

write “Hello"
write “Hello"
write “Hello"
write “Hello”

I can do that entirely based on manipulations of the tree structure. The “parse_tree” function is quite simple. It's there: https://github.com/c3d/xl/blob/master/xlr/runtime.cpp#L335 (parse_tree_inner is just above, line 268).

I invite you to analyse how you could write a function like this with your structure. How would it know what to do with “return” attributes or “function” tags?

Christophe de Dinechin

unread,
Nov 1, 2018, 5:44:05 AM11/1/18
to xlr-...@googlegroups.com
I forgot to mention that this is how you would do it in C / C++, i.e. manipulating the internal structure directly. If you were writing it in XL2, you’d use the “translate” statement which deals with the boilerplate for you, see https://github.com/c3d/xl/blob/master/xl2/native/xl.plugin.differentiation.xl#L34 for an example of plugin doing a number of rewrites to implement symbolic differentiation.

Christophe de Dinechin

unread,
Nov 6, 2018, 4:25:47 AM11/6/18
to xlr-...@googlegroups.com
Hi,

> On 31 Oct 2018, at 13:22, Dmitry Ponyatov <dpon...@gmail.com> wrote:
>
> Here I put minimal XL0 in Python (in progress): https://github.com/ponyatov/xl/blob/py/py/xl.py

Your approach looks interesting. There is obviously still a lot of work, e.g. to
actually parse numbers the XL way, e.g. accept 16#FF.FF0#e-4 :-) But one step
at a time.

Out of curiosity, two (possibly related) questions:

- why do you want to reimplement from scratch?
- why Python?

These are real questions, not some passive-aggressive reaction. I want to understand
what your goal is in order to help you.


Thanks
Christophe

PS: This may be bad timing, but I’ve been trying to reorganize the XL repositories,
so some of the URLs I gave may break. Sorry about that. I’m basically trying
to bring all the XL repositories into one and make it so that I can safely rebuild
Tao3D on top of recent LLVM variants, then start working on more modern dialects
of XL and bring Tao3D and ELFE along.

>
> I've done class hierarchy and primitive parser, able to parse only numbers with an optional sign prefix.
>
> <inherit_graph_0.png>
>
>
>
> // comment
>
> -01 +02.30 -0.4e+05 0xDeadBeef 0b1101
>
>
> parses into
>
>
>
> <name:->
> <integer:1>
>
> <name:+>
> <real:2.3>
>
> <name:->
> <real:40000.0>
>
> <name:0xDeadBeef>
>
> <name:0b1101>
>
>
>
>
>
>
>
> --
> You received this message because you are subscribed to the Google Groups "XL programming language and runtime" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to xlr-talk+u...@googlegroups.com.
> To post to this group, send email to xlr-...@googlegroups.com.
> Visit this group at https://groups.google.com/group/xlr-talk.
> For more options, visit https://groups.google.com/d/optout.
> <inherit_graph_0.png>

Dmitry Ponyatov

unread,
Nov 6, 2018, 5:01:39 AM11/6/18
to XL programming language and runtime
- why do you want to reimplement from scratch?
- why Python?

I have a tiny experimental system looks like Forth language, and I want to test what method of metaprogramming is the best for me:
- your very low-grained syntax tree rewrite approach, or
- my over-typed object system with direct object modification on a stack and in objects vocabulary

Another thing I'm interested deeply is inference on graphs of objects linked by relations (something like Minsky frames). First I thought about generic graph rewrite method, that's why I found your XL language.

My base job is embedded development, so I also interested in XL + LLVM as a cross-compiler system may be as powerful as C++ compiler written in Lisp (metaprogramming in compile time without template hell).

In a common, Python is the best for me as the prototyping language, it has no such syntax, memory and low-level coding swamp as C++. Other candidates were Ruby (which I don't know) and Smalltalk has a lot of problems with libraries and no batteries and rich community.

Dmitry Ponyatov

unread,
Nov 6, 2018, 8:42:51 AM11/6/18
to XL programming language and runtime
Reply all
Reply to author
Forward
0 new messages