Using Rob's lexer to build a bibtex parser

Hein Meling

Sep 2, 2015, 7:51:44 PM

to golang-nuts

Hi all,

I've been playing with Rob's lexer from way back (http://talks.golang.org/2011/lex/r59-lex.go), with the aim to adapt it so that I can build a bibtex parser (see http://www.bibtex.org/Format/).

I got some preliminary tests working, which makes me confident that I can do it. However, I wanted to ask whether this approach is considered idiomatic Go these days. I really like the approach, and I'm not worried about performance, if that was ever an issue.

My main concern is what the "interface" to the parser should be. There are no exported methods in the version linked above. Right now, I'm using the nextItem() method for testing, which seems reasonable. However, I'd appreciate any input on other APIs that more closely match current practice, if any. Perhaps something along the lines of text/scanner or json.Decoder?

Thanks,

:) Hein

Lars Seipel

Sep 2, 2015, 9:18:27 PM

to Hein Meling, golang-nuts

On Wed, Sep 02, 2015 at 04:51:44PM -0700, Hein Meling wrote:
> I got some preliminary tests working, which makes me confident that I can
> do it. However, I wanted to ask if this approach is considered idiomatic Go
> these days?? I really like the approach, and I don't worry about
> performance if that ever was an issue.

When Rob held that talk, there was a restriction in Go that meant that a
goroutine started during initialization might not run to completion.
That restriction has since been lifted, so you can omit the workaround
presented in the talk if you like. The text/template package has a
slightly revised version of the lexer presented in the talk.

Aside from that, the talk contents are still as relevant as they were
back then. It's a very nice approach, I think.

Rob Pike

Sep 2, 2015, 9:30:51 PM

to Lars Seipel, Hein Meling, golang-nuts

That talk was about a lexer, but the deeper purpose was to demonstrate how concurrency can make programs nice even without obvious parallelism in the problem. And like many such uses of concurrency, the code is pretty but not necessarily fast.

I think it's a fine approach to a lexer if you don't care about performance. It is significantly slower than some other approaches but is very easy to adapt. I used it in ivy, for example, but just so you know, I'm probably going to replace the one in ivy with a more traditional model to avoid some issues with the lexer accessing global state. You don't care about that for your application, I'm sure.

So: It's pretty and nice to work on, but you'd probably not choose that approach for a production compiler.

-rob


Hein Meling

Sep 10, 2015, 4:59:40 PM

to golang-nuts, lars....@gmail.com, hein....@gmail.com

Thanks for the replies, Rob and Lars.

I've played some more, and I have something that works fairly well (not feature complete, but almost; and there is at least one known bug). I've added a few helper methods to the lexer object, which make writing the different lexing state functions quite easy.

I've not yet built the parser that I was planning (maybe next week). Anyway, I've uploaded the lexer here: https://github.com/meling/biblexer in case it is of interest to someone else. If anyone has feedback, I'd appreciate it.

Thanks,
:) Hein

Roberto Zanotto

Sep 10, 2015, 6:02:42 PM

to golang-nuts, lars....@gmail.com, hein....@gmail.com

For the parser I'd suggest a recursive descent one (if the grammar is simple enough, i.e. LL(1) or LL(2)).

Regarding the API, the lexer should take a Reader and give you a NextToken method; the parser should take a lexer and give you a syntax tree. You might want to keep the lexer (and the types representing tokens) private and expose only the parser. Or you might want to keep both private and just provide a command-line application that does something useful. It depends on what your goal is.
