Low-hanging fruit: a front-end to make, TeX, etc., etc.


Jeffrey Kegler

Apr 3, 2013, 1:55:55 PM
to Marpa Parser Mailing List
Many of our best and most popular programming tools have unlovable
interfaces. What tools should be on the list is a matter of taste.
Among the important and excellent tools with annoying interfaces I would
include:

TeX
autoconf
Module::Build
make
Cweb

Previously, you would have had to rewrite the front-end using the
parsing technologies the original authors had available. This meant
trying to improve on the work of a programmer who was somewhere between
brilliant and a genius. Very arguably, the interfaces were as good as
they could reasonably be expected to get. (Also, a new front-end would
have had to be redone every time the tool changed, and revised for each
new version and variant of the tool, which would have been quite the burden.)

With Marpa, creating a new interface becomes much easier. And since
you'd be working with a clean lexer and BNF, revising the front-end (or
developing it incrementally) also becomes much easier.

-- jeffrey

Deyan Ginev

Apr 3, 2013, 2:13:23 PM
to marpa-...@googlegroups.com
Putting TeX in the same bunch with the rest is not really appropriate, as TeX is a Turing-complete programming language that actively interprets its input and modifies its behavior on demand. Using Marpa there would be a bigger hindrance than help, as the processing model is a depth-first expansion that modifies the state of the interpreter.

In fact, my interest in Marpa comes from co-developing LaTeXML, a rewrite of TeX's parser that reorients the processing towards the creation of XML documents and adds stronger support for semantic macros. Looking at the processing model of TeX, the one place where Marpa fits is in processing mathematical expressions into ASTs, which can then be serialized as XML. Of course, you could use Marpa to support a simplification of TeX, i.e. its standard macros and environments, ignoring all catcode and programming-related logic. Many convenience tools do that, e.g. MathJax, which parses a subset of LaTeX's predefined macros, especially those used for writing math expressions.
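
To make the "math expressions into ASTs, serialized as XML" idea concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: a hand-written recursive-descent parser stands in for a Marpa grammar, the supported subset (letters, numbers, +, -, ^, braces and \frac) is arbitrary, and the element names (apply, power, frac, num, var) are invented for this example; they are not LaTeXML's or MathML's schema.

```python
import re
from xml.etree import ElementTree as ET

# Tokens of a tiny TeX-math subset; catcodes and macro expansion are ignored.
TOKEN = re.compile(r"\s*(\\[A-Za-z]+|[A-Za-z]|\d+|[{}^+\-])")

def tokenize(src):
    src = src.rstrip()
    pos, out = 0, []
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError(f"unexpected input at offset {pos}: {src[pos:]!r}")
        out.append(m.group(1))
        pos = m.end()
    return out

class Parser:
    def __init__(self, tokens):
        self.toks, self.i = tokens, 0

    def peek(self):
        return self.toks[self.i] if self.i < len(self.toks) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected or 'a token'}, got {tok!r}")
        self.i += 1
        return tok

    def expr(self):                      # expr ::= term (('+' | '-') term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            op = ET.Element("apply", op=self.take())
            op.append(node)
            op.append(self.term())
            node = op
        return node

    def term(self):                      # term ::= atom ('^' atom)?
        base = self.atom()
        if self.peek() == "^":
            self.take()
            power = ET.Element("power")
            power.append(base)
            power.append(self.atom())
            return power
        return base

    def atom(self):                      # atom ::= '{' expr '}' | \frac atom atom | var | num
        tok = self.take()
        if tok == "{":
            node = self.expr()
            self.take("}")
            return node
        if tok == r"\frac":
            frac = ET.Element("frac")
            frac.append(self.atom())
            frac.append(self.atom())
            return frac
        if tok.isdigit():
            leaf = ET.Element("num")
        elif tok.isalpha():
            leaf = ET.Element("var")
        else:
            raise SyntaxError(f"unsupported token {tok!r}")
        leaf.text = tok
        return leaf

def tex_math_to_xml(src):
    parser = Parser(tokenize(src))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input starting at {parser.peek()!r}")
    return ET.tostring(tree, encoding="unicode")

print(tex_math_to_xml(r"\frac{a+1}{b}^2"))
```

Anything outside the subset (for example \alpha or a \def) is rejected rather than guessed at, which is the whole point of a deliberately simplified front-end.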

Greetings,
Deyan





--
You received this message because you are subscribed to the Google Groups "marpa parser" group.
To unsubscribe from this group and stop receiving emails from it, send an email to marpa-parser+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.



Jeffrey Kegler

Apr 3, 2013, 2:55:02 PM
to marpa-...@googlegroups.com
@Deyan: LaTeXML sounds interesting.  Is there a write-up somewhere?

You're right, of course, that TeX is self-modifying, so that a TeX front-end could not, directly, give access to the full TeX capabilities.  In fact, self-modification issues apply to all the examples I listed -- even "make" has some self-modifying features, IIRC.  But, back to the example of TeX, my sense is that while authors of packages often deal heavily in self-modification, actual TeX documents make little or no direct use of it.

A possible approach is something like what is often done these days when work at the assembly language level is required.  A typical approach is to use a C compiler with an assembly language extension.  Essentially, everything you can do in the convenient language (C), you do there.  For the stuff that C cannot do, you use the assembly language extension.

Another, easier, approach is to make the front-end language "small" and focused.  Rather than attempt to provide access to the full TeX (or make or autoconf) functionality, it'd give access to only a subset.

-- jeffrey


Deyan Ginev

Apr 3, 2013, 3:28:29 PM
to marpa-...@googlegroups.com
On Wed, Apr 3, 2013 at 2:55 PM, Jeffrey Kegler <jeffre...@jeffreykegler.com> wrote:
> @Deyan: LaTeXML sounds interesting.  Is there a write-up somewhere?

Sure. The best way to read about it is possibly the extensive Manual available at:
http://dlmf.nist.gov/LaTeXML/manual.pdf (PDF)
http://dlmf.nist.gov/LaTeXML/manual/index.xhtml (HTML)
 
> You're right, of course, that TeX is self-modifying, so that a TeX-front end could not, directly, give access to the full TeX capabilities.  In fact, self-modification issues apply to all the examples I listed -- even "make" has some self-modifying features, IIRC.  But, back to the example of TeX, my sense is that while authors of packages often deal in self-modification heavily, actual TeX documents make no or restricted direct use of self-modification.

That might be a reasonable statement as far as typical LaTeX articles are concerned but tends to be much worse for plain TeX and any larger real-world document. But in any case, yes, there is a use case for simple TeX parsers.
 
> A possible approach is something like what is often done when work at the assembler language level is required these days.  A typical approach is to use a C compiler with an assembly language extension.  Essentially, everything you can in the convenient language (C), you do.  For the stuff that C cannot do, you use the assembly language extension.

C has a real specification and there is a lot of sanity there. TeX's "specification" is simply Knuth's implementation code in combination with "The TeXbook", and things are far away from the rigor of C syntax. So I expect attempting something C-like to be a nightmare.
 
> Another, easier, approach is to make the front-end language "small" and focused.  Rather than attempt to provide access to the full Tex (or make or autoconf) functionality, it'd give access to only a subset.

That has a lot more merit to it and has been done frequently in the past to different extents (for a related overview, see http://kwarc.info/kohlhase/submit/dml09.pdf). What that review doesn't cover is that there is a new high-profile solution for writing TeX mathematics on the web, called MathJax, which is a client-side JavaScript application. TeX-like languages have a good history of being used on the web, e.g. the TeX math markup, usually in combination with Wiki markup. Ah! Actually, that would be a nice and more high-profile use of Marpa: creating front-end parsers for Wiki and Markdown markup. They should be dead simple to make.

Now I've got myself excited about writing a Markdown parser in Marpa. Hopefully I'll get some free time in the coming weeks to try that out (or, even better, someone beats me to it!)

Deyan
 


Ron Savage

Apr 3, 2013, 7:53:04 PM
to marpa-...@googlegroups.com
Hi Deyan

See below.

On 04/04/13 06:28, Deyan Ginev wrote:
> On Wed, Apr 3, 2013 at 2:55 PM, Jeffrey Kegler<
> jeffre...@jeffreykegler.com> wrote:
[snip]
> Now I got myself excited about writing a Markdown parser in Marpa,
> hopefully I get some free time in the coming weeks to try that out (or even
> better - someone beats me to it!)

Before you start with Markdown, note that it's (hopefully) obsoleted by
pandoc:

http://johnmacfarlane.net/pandoc/

I've just started to use pandoc for articles:

http://savage.net.au/jQuery-tutorials.html

--
Ron Savage
http://savage.net.au/
Ph: 0421 920 622

Deyan Ginev

Apr 3, 2013, 8:18:27 PM
to marpa-...@googlegroups.com

Hi Ron,

On Wed, Apr 3, 2013 at 7:53 PM, Ron Savage <r...@savage.net.au> wrote:
> [snip]
>> Now I got myself excited about writing a Markdown parser in Marpa,
>> hopefully I get some free time in the coming weeks to try that out (or even
>> better - someone beats me to it!)
>
> Before you start with Markdown, note that it's (hopefully) obsoleted by pandoc:
>
> http://johnmacfarlane.net/pandoc/
>
> I've just started to use pandoc for articles:
>
> http://savage.net.au/jQuery-tutorials.html


How would that render obsolete a Markdown parser written in Marpa? From what I know, Pandoc is a Haskell application, so it gives you Markdown support in Haskell. GitHub offers Markdown support in Ruby, and I recall seeing a JavaScript Markdown renderer. I haven't seen one in Perl yet, so why not serve Markdown pages in, e.g., Perl web applications using a Marpa-based renderer to HTML?

I see this as akin to the Marpa JSON parser that I've seen before in this space: each programming language has its own JSON library, and Perl even has several. The same goes for Markdown renderers. And I think it might be a pretty nice showcase.
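
As a rough idea of what a small Markdown-to-HTML renderer involves, here is a sketch in Python. It is not Marpa-based and not a conforming Markdown implementation: it assumes a tiny subset ("#" headings, blank-line separated paragraphs, *emphasis* and `code` spans), and the function name render_markdown_subset is made up for this example.

```python
import html
import re

def render_markdown_subset(text):
    """Render '#' headings, paragraphs, *emphasis* and `code` spans to HTML."""
    def inline(s):
        s = html.escape(s, quote=False)                   # escape &, < and > first
        s = re.sub(r"`([^`]+)`", r"<code>\1</code>", s)   # `code`
        s = re.sub(r"\*([^*]+)\*", r"<em>\1</em>", s)     # *emphasis*
        return s

    out = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        m = re.match(r"(#{1,6})\s+(.*)", block)
        if m:
            level = len(m.group(1))
            out.append(f"<h{level}>{inline(m.group(2))}</h{level}>")
        else:
            # Join the wrapped lines of a paragraph with single spaces.
            out.append(f"<p>{inline(' '.join(block.split()))}</p>")
    return "\n".join(out)

print(render_markdown_subset("# Hi *there*\n\nSome `code` & text\non two lines."))
```

Lists, links, nested emphasis and the many corner cases of real Markdown are exactly where a proper grammar would start to pay off.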

Cheers,
Deyan

Zakariyya Mughal

Apr 3, 2013, 9:07:21 PM
to marpa-...@googlegroups.com
On 2013-04-03 at 11:55:02 -0700, Jeffrey Kegler wrote:
> Another, easier, approach is to make the front-end language "small"
> and focused. Rather than attempt to provide access to the full Tex
> (or make or autoconf) functionality, it'd give access to only a
> subset.
>

This is precisely the approach I would like to take with a future
project of mine. I'm interested in using Marpa for developer tools such
as static analysis and code completion. I think that the latter can
greatly benefit from using Marpa's progress reports and Ruby slippers
parsing, because code that is being typed up may not be in a state where
it completely conforms to the grammar.
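
To illustrate why a parser that knows "what could come next" helps with completion, here is a small sketch in Python. It is a toy Earley recognizer, not Marpa and not Marpa's API; the grammar, the token names and the function expected_terminals are all invented for the example, and the sketch assumes a grammar without empty rules.

```python
# Toy grammar: nonterminals are the dict keys, everything else is a terminal.
GRAMMAR = {
    "stmt": [["name", "=", "expr", ";"]],
    "expr": [["term"], ["expr", "+", "term"]],
    "term": [["name"], ["number"], ["(", "expr", ")"]],
}

def expected_terminals(grammar, start, tokens):
    """Recognize the token prefix and return the terminals acceptable next."""
    def close(chart, k):
        # Apply prediction and completion until Earley set k stops growing.
        work = list(chart[k])
        while work:
            lhs, rhs, dot, origin = work.pop()
            if dot < len(rhs) and rhs[dot] in grammar:            # predict
                new = [(rhs[dot], tuple(alt), 0, k) for alt in grammar[rhs[dot]]]
            elif dot == len(rhs):                                  # complete
                new = [(l, r, d + 1, o) for (l, r, d, o) in chart[origin]
                       if d < len(r) and r[d] == lhs]
            else:
                new = []
            for item in new:
                if item not in chart[k]:
                    chart[k].add(item)
                    work.append(item)

    chart = [{(start, tuple(alt), 0, 0) for alt in grammar[start]}]
    close(chart, 0)
    for k, tok in enumerate(tokens):
        # Scan: advance every item that was waiting for this terminal.
        scanned = {(l, r, d + 1, o) for (l, r, d, o) in chart[k]
                   if d < len(r) and r[d] == tok}
        if not scanned:
            raise SyntaxError(f"token {k} ({tok!r}) is not expected here")
        chart.append(scanned)
        close(chart, k + 1)
    # Terminals right after a dot in the last set are the completion candidates.
    return sorted({r[d] for (_, r, d, _) in chart[-1]
                   if d < len(r) and r[d] not in grammar})

print(expected_terminals(GRAMMAR, "stmt", ["name", "=", "name"]))
```

The input "name = name" is not yet a complete statement, but the recognizer can still say that "+" or ";" may follow, which is the kind of information an editor would turn into suggestions.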

For LaTeX, it would probably be best to see how it is used "in-the-wild"
and target parts of the grammar at supporting specific packages such as
keyvals and TikZ.

Cheers,
- Zaki Mughal


Ron Savage

Apr 3, 2013, 10:47:00 PM
to marpa-...@googlegroups.com
Hi Deyan

On 04/04/13 11:18, Deyan Ginev wrote:
> Hi Ron,
>
> On Wed, Apr 3, 2013 at 7:53 PM, Ron Savage<r...@savage.net.au> wrote:
>
>> [snip]
>>
>>> Now I got myself excited about writing a Markdown parser in Marpa,
>>> hopefully I get some free time in the coming weeks to try that out (or
>>> even better - someone beats me to it!)
>>
>> Before you start with Markdown, note that it's (hopefully) obsoleted by
>> pandoc:
>>
>> http://johnmacfarlane.net/pandoc/
>>
>> I've just started to use pandoc for articles:
>>
>> http://savage.net.au/jQuery-tutorials.html
>>
>>
> How would that render obsolete a Markdown parser written in Marpa? From

I was thinking of the syntax to be analyzed. Handling pandoc syntax,
rather than just Markdown, is my preferred target of any such
Marpa-based code.

> what I know Pandoc is a Haskell application, so that gives you Markdown
> support in Haskell, GitHub offers markdown support in Ruby, and I recall
> seeing a JavaScript Markdown renderer. I haven't seen one in Perl yet, and
> why not serve Markdown pages in e.g. Perl web applications using a Marpa
> renderer to HTML?
>
> I see this akin to the Marpa JSON parser that I've seen before in this
> space - each programming language has its own JSON library, Perl even has
> several. Similarly for markdown renderers. And I think it might be a pretty
> nice showcase.
>
> Cheers,
> Deyan
>

Peter Stuifzand

Apr 4, 2013, 3:43:13 AM
to marpa-...@googlegroups.com
Re the Makefile parser: I dumped something in a gist some time ago: https://gist.github.com/pstuifzand/4448957

--
Peter Stuifzand | peterstuifzand.nl | @pstuifzand



Peter Stuifzand

Apr 4, 2013, 3:52:21 AM
to marpa-...@googlegroups.com
Also, I was thinking about the Grok presentation by Steve Yegge [1]. He explains that they asked compiler writers whether they would include debugging information to make their project possible. It seems Make also provides such debugging information [2]. The option is "-p" (or "--print-data-base"), typically combined with "-q" as "-qp" so that nothing is actually rebuilt. The output includes the implicit rules and other things make does behind the scenes. It could be helpful.
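
A rough sketch of how that database dump could be consumed, in Python. The assumptions are significant: the SAMPLE text below is hand-written to resemble the dump (rule lines, tab-indented recipes, "#" comments, variable assignments), the real output differs between make versions and is far noisier, and the function name rules_from_database is made up.

```python
import re

# "target: prereqs" or "target:: prereqs"; the target may not start with '#' or whitespace.
RULE = re.compile(r"^([^\s#:=][^:=]*?)\s*::?\s*(.*)$")

def rules_from_database(dump):
    """Map each target to its prerequisites, given text shaped like a make database dump."""
    rules = {}
    for line in dump.splitlines():
        if not line or line[0] in "#\t":       # skip blanks, comments and recipe lines
            continue
        m = RULE.match(line)
        if m and "=" not in line:              # skip assignments such as NAME := value
            rules.setdefault(m.group(1), []).extend(m.group(2).split())
    return rules

# Hand-written stand-in for the dump; not captured from a real make run.
SAMPLE = """\
# Variables
CC = cc
OBJS := main.o util.o
# Files
app: main.o util.o
\t$(CC) -o app main.o util.o
main.o: main.c
.PHONY: clean
clean:
\trm -f app *.o
"""

for target, prereqs in rules_from_database(SAMPLE).items():
    print(target, "<-", prereqs)
```

A dictionary like this is already enough to draw a dependency graph, although it says nothing about variable expansion or the recipes themselves.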


--
Peter Stuifzand | peterstuifzand.nl | @pstuifzand


Durand Jean-Damien

Apr 4, 2013, 4:16:23 PM
to marpa-...@googlegroups.com
Code completion is a great idea -; I can well imagine a language-independent Eclipse plugin -;

Ron Savage

Apr 4, 2013, 6:30:28 PM
to marpa-...@googlegroups.com
Hi Peter

On 04/04/13 18:43, Peter Stuifzand wrote:
> Re the Makefile parser: I dumped something in a gist sometime ago
> https://gist.github.com/pstuifzand/4448957

Thanx.

A couple of years ago I looked at how to parse makefiles, with a view to
feeding a parser's output into GraphViz2 (1). I did not want to parse
makefiles myself, so I thought I could use the debug output of gmake,
but it's only a partial representation of the logic.

Also, there is the issue of env var expansion.

So I decided it was all too difficult :-(.

(1) https://metacpan.org/release/GraphViz2

Ron Savage

May 22, 2013, 9:13:14 PM
to marpa-...@googlegroups.com
Hi

Perhaps a parser for vCard files or LDIF files is an idea?

Paul Bennett

May 23, 2013, 5:39:27 AM
to marpa-...@googlegroups.com
On Wed, May 22, 2013 at 9:13 PM, Ron Savage <r...@savage.net.au> wrote:
> Hi
>
> Perhaps a parser for vCard files or LDIF files is an idea?

Based on my own "fun" while manually trying to make a .tmLanguage file
(syntax highlighting rules, folding rules, etc., for TextMate
and/or Sublime Text 2) for Marpa::R2::Scanless grammars, I may write a
little something to automatically convert them. I suspect I'd start by
requiring the "action" name to be the .tmLanguage "selector" name to
be used (TM selectors work a little bit like CSS selectors, or I guess
more like CSS class name combinations). Or, if I subclass Scanless.pm
properly, could I add a "selector" adverb right into the grammar?

I'll gladly take a stab at vCard and related formats. I've been
meaning to write a parser for those anyway, and Marpa is indeed the
modern parser-writer's weapon of choice. The trouble is that I'm a
great starter of projects, but a less-great finisher of them.
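
For a sense of scale, the line-level structure of a vCard can be sketched in a few lines of Python. This is only a sketch under assumptions: it handles folded continuation lines and NAME;PARAM=value:VALUE content lines, ignores quoted parameter values and value escaping, and the function name parse_vcard and the sample card are invented.

```python
def parse_vcard(text):
    """Parse one vCard into a list of (name, params, value) tuples."""
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]               # unfold a continuation line
        elif raw:
            lines.append(raw)

    props = []
    for line in lines:
        head, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"missing ':' in content line {line!r}")
        name, *params = head.split(";")
        # Each parameter looks like KEY=value; keep (KEY, value) pairs.
        param_map = dict(p.partition("=")[::2] for p in params)
        props.append((name.upper(), param_map, value))
    return props

# Made-up sample card; the TEL line is folded across two lines.
SAMPLE = "BEGIN:VCARD\nVERSION:4.0\nFN:Jane Doe\nTEL;TYPE=cell:+1 555\n 0100\nEND:VCARD\n"

for prop in parse_vcard(SAMPLE):
    print(prop)
```

The parts this sketch skips (quoting, escaping, grouping, multiple cards per file) are where a real grammar would earn its keep.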


--
PWBENNETT @ CPAN / github
paul.w....@gmail.com @ Google Code

Ron Savage

May 23, 2013, 7:08:07 PM
to marpa-...@googlegroups.com
Hi Paul

On 23/05/13 19:39, Paul Bennett wrote:
> On Wed, May 22, 2013 at 9:13 PM, Ron Savage<r...@savage.net.au> wrote:
>> Hi
>>
>> Perhaps a parser for vCard files or LDIF files is an idea?
>
> Based on my own "fun" while manually trying to make a .tmLanguage file

Hmm. Never heard of '.tmLanguage' :-).

> (syntax highlighting rules (and folding rules, and etc) for Textmate
> and/or Sublime Text 2) for Marpa::R2::Scanless grammars, I may write a
> little something to automatically convert them. I suspect I'd start by
> requiring the "action" name to be the .tmLanguage "selector" name to
> be used (TM selectors work a little bit like CSS selectors, or I guess
> more like CSS class name combinations). Or, if I subclass Scanless.pm
> properly, could I add a "selector" adverb right into the grammar?
>
> I'll gladly take a stab at vCard and related formats. I've been
> meaning to write a parser for those anyway, and Marpa is indeed the
> modern parser-writer's weapon of choice. The trouble is that I'm a
> great starter of projects, but a less-great finisher of them.

Feel free to beat me to it, but I /am/ certainly heading in that
direction, as a way of teaching myself about the SLIF technique.

Paul Bennett

May 24, 2013, 5:33:55 AM
to marpa-...@googlegroups.com
On Thu, May 23, 2013 at 7:08 PM, Ron Savage <r...@savage.net.au> wrote:
> Hi Paul

Hi! :-)

> On 23/05/13 19:39, Paul Bennett wrote:
>>
>> Based on my own "fun" while manually trying to make a .tmLanguage file
>
> Hmm. never heard of '.tmLanguage' :-).

It's a horrible, nasty, brutish-yet-subtle monster, constructed by
trying to use XML as a programming language instead of a markup
language.

It's possible to do this acceptably well (see also XSLT), but it's
also possible to do it incredibly badly (see also .tmLanguage). It
manages to be so bad, in part, thanks to being an application of the
Apple PList DTD, which brings its own set of weirdness to the party.

It's also not documented nearly as readably as I'd like it to be --
especially hard to find is a grammar for and discussion of its notion
of "selectors", which are sort of like CSS selectors, but backwards:
if regexp $x matches (except it's seldom just a single regexp, more
usually several acting in collusion), then the text has selector $y.
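
The "if regexp $x matches, the text has selector $y" direction can be sketched like this in Python. It is a simplification under assumptions: real .tmLanguage files are property lists with nested begin/end rules, while this sketch uses a flat list of (pattern, selector) pairs, and the patterns and selector names are illustrative only.

```python
import re

# Hypothetical rules for a BNF-like line; the first matching rule wins.
RULES = [
    (re.compile(r"#.*"), "comment.line"),
    (re.compile(r"::=|~"), "keyword.operator"),
    (re.compile(r"<[^>]+>"), "entity.name.symbol"),
    (re.compile(r"'[^']*'"), "string.quoted.single"),
]

def assign_selectors(line):
    """Return (start, end, selector) spans for the parts of `line` that match a rule."""
    spans, pos = [], 0
    while pos < len(line):
        for pattern, selector in RULES:
            m = pattern.match(line, pos)
            if m and m.end() > pos:
                spans.append((m.start(), m.end(), selector))
                pos = m.end()
                break
        else:
            pos += 1                           # no rule applies to this character
    return spans

line = "<expr> ::= <term> '+' <term>  # sum"
for start, end, selector in assign_selectors(line):
    print(f"{line[start:end]!r:12} -> {selector}")
```

The editor then maps selectors to colours and navigation features separately, which is why the selector is attached to the text rather than chosen by it as in CSS.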

Somewhere else in the bowels & brain of TextMate & ST2, there's a
mapping of selector syntax to things like color highlighting, and to
the stuff that lets you "jump to the definition of the 'thing' under
the cursor", and all that good stuff.

>> I'll gladly take a stab at vCard and related formats.
>
> Feel free to beat me to it, but I /am/ certainly heading in that direction,
> and a way of teaching myself about the SLIF technique.

Well, I don't want to tread on any toes, here. Have at it, and I'll
keep plugging at my seemingly-interminable list of projects in
progress.



--
Paul Bennett
PWBENNETT @ CPAN & github
paul.w....@gmail.com @ various Google things

Ron Savage

May 24, 2013, 6:59:15 AM
to marpa-...@googlegroups.com
Hi Paul

On 24/05/13 19:33, Paul Bennett wrote:
> On Thu, May 23, 2013 at 7:08 PM, Ron Savage<r...@savage.net.au> wrote:

Thanx for the detail re .tmLanguage. In a word: Yuk!

My context is that I'm converting a set of my Perl modules from YUI to
jQuery, and currently I'm up to importing vCard files, and perhaps later
LDIF files. I currently have code which parses vCards - It's at the JS
level I'm stuck. But to replace all the XML processing with using
Marpa... That's tempting.

Jakub Narębski

May 29, 2013, 3:39:22 PM
to marpa-...@googlegroups.com
On Wednesday, April 3, 2013 8:13:23 PM UTC+2, Deyan Ginev wrote:
> Putting TeX in the same bunch with the rest is not really appropriate, as TeX is a Turing-complete programming language that actively interprets its input and modifies its behavior on demand. Using Marpa there would be a bigger hindrance than help, as the processing model is a depth-first expansion that modifies the state of the interpreter.

On the other hand, if Marpa could be massaged to parse a subset of TeX / LaTeX and provide meaningful error messages, it would be really nice.  Nowadays, if there is an error in a LaTeX document, you can get a strange error message many pages later and then have to hunt for where the actual error is (which might be a missing closing '}', or a command used in a mode where it doesn't work, etc.).
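
Even without a full grammar, the missing-brace case shows what "meaningful" could look like. The Python sketch below is an assumption-laden toy: it only tracks '{' and '}', treats a backslash as escaping the next character, treats '%' as starting a comment, knows nothing about verbatim environments or catcode changes, and the function name check_braces is invented.

```python
def check_braces(source):
    """Return messages for unbalanced '{' / '}' in TeX-like source, as line:column."""
    problems, stack = [], []
    for lineno, line in enumerate(source.splitlines(), start=1):
        col = 0
        while col < len(line):
            ch = line[col]
            if ch == "\\":
                col += 2                       # skip the escaped character, e.g. \{ or \%
                continue
            if ch == "%":
                break                          # the rest of the line is a comment
            if ch == "{":
                stack.append((lineno, col + 1))
            elif ch == "}":
                if stack:
                    stack.pop()
                else:
                    problems.append(f"{lineno}:{col + 1}: '}}' without matching '{{'")
            col += 1
    # Whatever is still on the stack was opened but never closed.
    problems += [f"{l}:{c}: '{{' is never closed" for l, c in stack]
    return problems

doc = "\\section{Intro}\n\\textbf{bold\nmore text % }\n"
print(check_braces(doc))
```

The message points at the place where the group was opened, rather than at the page where TeX finally gives up.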

N.B. TeX has more than the usual two phases (lexing and parsing), and it can modify the status of characters during parsing: e.g. `\makeatletter` gives '@' the category "letter", so that commands such as `\@foo` can be used...
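
The effect on tokenization can be shown with a toy lexer in Python. Assumptions are plentiful here: only the catcode of '@' is mutable, \makeatletter and \makeatother are hard-wired into the lexer instead of being expanded as macros the way LaTeX really does it, whitespace is simply dropped, and the function name tokenize is made up.

```python
LETTER, OTHER = 11, 12      # TeX's category codes for "letter" and "other"

def tokenize(src):
    """Return control sequences and characters, honouring a mutable catcode for '@'."""
    catcode = {"@": OTHER}

    def is_letter(ch):
        return catcode.get(ch, LETTER if ch.isalpha() else OTHER) == LETTER

    tokens, i = [], 0
    while i < len(src):
        if src[i] == "\\":
            j = i + 1
            while j < len(src) and is_letter(src[j]):
                j += 1                          # control word: backslash plus letters
            if j == i + 1 and j < len(src):
                j += 1                          # control symbol such as \@ or \%
            name = src[i:j]
            tokens.append(name)
            if name == r"\makeatletter":        # the lexer's rules change mid-stream
                catcode["@"] = LETTER
            elif name == r"\makeatother":
                catcode["@"] = OTHER
            i = j
        else:
            if not src[i].isspace():
                tokens.append(src[i])
            i += 1
    return tokens

print(tokenize(r"\@foo \makeatletter \@foo \makeatother \@foo"))
```

The same four characters come out as the tokens \@, f, o, o before \makeatletter and as the single token \@foo after it, so the lexer cannot be fixed in advance of running the document.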

Jakub Narębski

Jun 14, 2013, 4:57:39 PM
to marpa-...@googlegroups.com
And there are such things like \def, \edef and \expandafter...

-- 
Jakub Narębski
 