Documentation for Engine implementors (V5)


Arsène von Wyss

Sep 14, 2012, 12:22:32 PM
to gold-pars...@googlegroups.com
Today I made another attempt at updating my engine (bsn) to the new EGT file format, including all its new features. Unfortunately, I found that essential parts of the documentation are missing (the pseudocode, for instance), and some of what exists is simply incorrect (some of the records, see the appendix). I discovered this by looking at the source of the "reference implementation" in VB.NET.

So I see two options really:
  • Not adding support for EGT until this has been addressed. Unless the docs get updated soon, this is not ideal, since my engine seems to be well accepted by the .NET community: there are at least two CodeProject articles about it and I have received some contribution offers, one of which was from an intern working at a well-known .NET tools manufacturer.
  • Doing my best to guess the desired behavior by inspecting Devin's code. This seems to be what Ralph Iden is doing, since he explicitly says that his design follows the original implementation closely, which allows for quick adaptation. This is not ideal for me either, since my engine differs a bit more and I'd rather work from a specification than reverse-engineer the sample engine.

Devin, is there a planned schedule for the documentation?
Does anyone have thoughts on the topic which they want to share?

Thanks, Arsène


Appendix - problems in the docs (http://www.goldparser.org/doc/egt/index.htm):
  • The Table Counts record has contradictory info about the character identifying the record: is it t (116) as in the diagram or T (84) as written in the text? It seems to be "t"...
  • The Character Set Table is completely off: the Unicode plane is not even used in Devin's code, the "range 1..n" shows wrong information (from the source I gather that each range should be an inclusive-start character code and an inclusive-end character code), and the "Fields" section lists the old CGT format information...
  • The Group Record shows 1..n nested "group index" integers, but the count of those is not in the diagram, even though it seems to be in the file. Also, shouldn't that be 0..n?
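
To illustrate the inclusive-range reading from the second item, here is a minimal Python sketch. It assumes only what is stated above, namely that each range is a start and end character code with both ends inclusive. The function names and the flat-list input are hypothetical and do not reflect the actual EGT record layout or any engine's API.

```python
from typing import List, Tuple

# Hypothetical in-memory form of one character set: a list of
# (start, end) character-code pairs, with both ends inclusive.
CharRange = Tuple[int, int]


def build_ranges(values: List[int]) -> List[CharRange]:
    """Pair up a flat list [start1, end1, start2, end2, ...]."""
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of range values")
    ranges = list(zip(values[0::2], values[1::2]))
    for start, end in ranges:
        if start > end:
            raise ValueError(f"range start {start} exceeds end {end}")
    return ranges


def contains(ranges: List[CharRange], ch: str) -> bool:
    """Return True if ch falls inside any range, counting both endpoints."""
    code = ord(ch)
    return any(start <= code <= end for start, end in ranges)
```

With ranges built from `[ord("a"), ord("z"), ord("A"), ord("Z")]`, both `"a"` and `"z"` are members because the end code is included, while `"{"` (the code right after `"z"`) is not.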

Jay B

Sep 14, 2012, 7:24:37 PM
to gold-pars...@googlegroups.com
I just started using your bsn engine recently and I really like how cleanly and elegantly I was able to implement my language's AST. It's really a nice piece of software.

I would like to see support for the EGT format, as it would shrink my CGT to almost 1/10th of its size. I don't really know what the benefit would be other than the size of the grammar tables file. If the code could get even faster, that would be nice, but it's pretty fast right now. If there's a good chance of bugs, I'd rather stick with what works for now.

Again, great library. It was actually a pleasure building my AST.

And thanks to Devin as well for GOLD and the quick turnaround on my recent bug.

Jay B

Arsène von Wyss

Sep 15, 2012, 10:39:19 AM
to gold-pars...@googlegroups.com
Jay, thanks for the positive feedback! I currently have a version that can read the EGT format but which doesn't support groups yet, so I guess this would be of no use to you since you're using the groups functionality. What you can do however is to feed the CGT through the PackCGT program which comes with my library; it typically reduces the size to around 50% but the CGT is then only readable by my library and not by others.

Regarding performance you need to be aware that my library requires some warm-up before achieving full performance. During initialization and the first passes it does plenty of checks and dynamically emits IL code, which then needs to be compiled by the JIT compiler for execution. Therefore the first few runs may seem a bit sluggish, but afterwards the performance is very good (I parse quite extensive SQL code with it in usually less than a millisecond on my PC), and the CompiledGrammar class is also thread-safe, so that you can parallelize the parsing process to scale with your CPU cores if you have multiple independent fragments to parse.

Cheers, Arsène

Devin Cook

Sep 17, 2012, 2:07:53 PM
to gold-pars...@googlegroups.com
Whoops! I need to make some fixes to the documentation. For example,
the chart in the Character Set Table page is wrong. The "range"
integers should be labeled Start Character and End Character.

I'll make some fixes and get the documentation up to speed.

- Devin

Devin Cook

Sep 17, 2012, 5:35:12 PM
to gold-pars...@googlegroups.com
I just fixed the charts on the EGT documentation. You might have to
hit Refresh for the new version to appear.


Arsène von Wyss

Dec 4, 2012, 8:27:39 AM
to gold-pars...@googlegroups.com, mi...@devincook.com
Devin, can you give an approximate schedule for the engine pseudo code?
http://goldparser.org/doc/engine-pseudo/index.htm

Thanks, Arsène