golex


Miek Gieben

Jan 18, 2011, 7:32:49 AM
to Go List
Hello,

There is a goyacc (many thanks for that!), but there isn't a golex cmd. Are
there any plans to develop a golex?

I've looked around for other lexers (and grammars) in the standard package
tree and there is:
* ebnf - only lets you make and verify grammars. Getting to the
actual data parsed with the grammar is not implemented.
* scanner - could not get it to recognize words containing / or other
characters that have significance for Go

So I ended up creating my own lexer.

Kind regards,

--
Miek


nsf

Jan 18, 2011, 7:48:35 AM
to golan...@googlegroups.com

I prefer using ragel for writing lexers, and the SVN version has
support for the Go programming language.

http://www.complang.org/ragel/

It's a very flexible state machine code generator, but maybe a bit
hard to use.. although it's well documented.


Other people prefer writing lexers by hand.

Miek Gieben

Jan 18, 2011, 9:34:11 AM
to golan...@googlegroups.com
[ Quoting nsf in "Re: [go-nuts] golex"... ]

> I prefer using ragel for writing lexers, and the SVN version has
> support for the Go programming language.
>
> http://www.complang.org/ragel/
>
> It's a very flexible state machine code generator, but maybe a bit
> hard to use.. although it's well documented.

Didn't know about that one. Thanks for the pointer!

grtz,


--
Miek


Sugu Sougoumarane

Jan 18, 2011, 3:36:07 PM
to golan...@googlegroups.com
I subconsciously dislike regular expressions. So, I gratuitously stole code from http://golang.org/pkg/go/token/ to write my own lexer (for an sql parser).


Miek Gieben

Jan 18, 2011, 3:43:40 PM
to golan...@googlegroups.com
[ Quoting Sugu Sougoumarane in "Re: [go-nuts] golex"... ]

> I subconsciously dislike regular expressions. So, I gratuitously stole code
> from http://golang.org/pkg/go/token/ to write my own lexer (for an sql parser).

Hmm, that is another idea... but the DNS package is getting pretty big
already. I want to avoid adding a lexer to the mix.

grtz Miek


Russ Cox

Jan 19, 2011, 8:30:18 AM
to golan...@googlegroups.com
What is so complex in the DNS package that you need a custom lexer?

Russ

Miek Gieben

Jan 19, 2011, 12:06:12 PM
to golan...@googlegroups.com
[ Quoting Russ Cox in "Re: [go-nuts] golex"... ]

> What is so complex in the DNS package that you need a custom lexer?

Parsing the RFC 1035 zone file format.

grtz,

--
Miek


jnml

Jan 20, 2011, 5:05:54 AM
to golang-nuts, ondre...@nic.cz
On 19 Jan, 18:06, Miek Gieben <m...@miek.nl> wrote:
Our company approved open sourcing of some in-house brewed Go code.
Before putting it elsewhere (google code, github, ...), I would like
to ask the dev team guys if they might be interested in taking a look at
it, i.e. to figure out if any of it is possibly worth including
in the Go distribution.

Frankly, the code in its present state is not IMO in shape for such
inclusion, but perhaps it can be seen as a first cut of its problem
area, allowing for some review/refactor/improve cycling to make it
acceptable.

Excerpts from the godocs:
--------
Lexer is a package for generating actionless scanners (lexeme
recognizers) at run time.

Scanners are defined by regular expressions and/or lexical grammars,
mapping between those definitions, token numeric identifiers and an
optional set of starting id sets, providing similar functionality to
switching start states in *nix LEX. The generated FSMs are Unicode
rune based and all unicode.Categories and unicode.Scripts are
supported by the regexp syntax using the \p{name} construct.
--------
The lex package provides support for a *nix (f)lex like tool on .l
sources. The syntax is similar to a subset of (f)lex, see also:
http://flex.sourceforge.net/manual/Format.html#Format
--------
Golex is a lex/flex like (i.e. not fully POSIX lex compatible)
utility. It renders .l formatted data to Go source code.
--------

Note: 'lexer' - NFA and Unicode rune based, 'lex' and 'golex' - DFA
and byte.

If there is some interest in any part of the above, I would then
prepare a CL to initiate the review process.

-jnml

Ben

Jan 20, 2011, 5:32:40 AM
to golang-nuts
FWIW, I recently wrote a lexer for Go (using generous copying-and-
pasting from regexp.go) that works with goyacc. Example input:

/[ \t]/ { /* Skip blanks and tabs. */ }
/[0-9]*/ { lval.n,_ = strconv.Atoi(yylex.Text()); return NUM }
/./ { return int(yylex.Text()[0]) }
//
package main
import (
	"os"
	"strconv"
)
func main() {
	yyParse(NewLexer(os.Stdin))
}

http://cs.stanford.edu/~blynn/nex/

-Ben

Miek Gieben

Jan 20, 2011, 5:47:24 AM
to golang-nuts
[ Quoting Ben in "[go-nuts] Re: golex"... ]

> FWIW, I recently wrote a lexer for Go (using generous copying-and-
> pasting from regexp.go) that works with goyacc. Example input:
>
> /[ \t]/ { /* Skip blanks and tabs. */ }
> /[0-9]*/ { lval.n,_ = strconv.Atoi(yylex.Text()); return NUM }
> /./ { return int(yylex.Text()[0]) }
> //
> package main
> import (
> 	"os"
> 	"strconv"
> )
> func main() {
> 	yyParse(NewLexer(os.Stdin))
> }
>
> http://cs.stanford.edu/~blynn/nex/

Apparently there is a need and the stuff in the standard Go packages
obviously isn't sufficient.

I'm very much in favor of adding a lexer, in addition to yacc, to the
standard lib of Go.

grtz Miek
