parsing a bibtex file
From: Rui Maciel <rui.mac...@gmail.com>
Newsgroups: comp.unix.programmer
Subject: Re: parsing a bibtex file
Date: Mon, 24 Sep 2012 02:16:10 +0100
Rudra Banerjee wrote:
> Is it possible to achieve my goal using flex?
Flex generates lexers, which are routines that convert sequences of
characters into sequences of tokens. If you wish to write a parser then
getting hold of a sequence of tokens is half the battle; the other half is
making sense of that sequence of tokens. For that you need another
routine, called a parser, which builds upon the lexer.
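To make the lexer half concrete, here is a minimal hand-written sketch in
Python rather than flex. It assumes a simplified BibTeX subset in which
values are double-quoted strings or bare words; the token names (AT, NAME,
STRING and so on) are made up for illustration and are not anything flex or
BibTeX defines.

```python
import re

# Hypothetical token set for a simplified BibTeX subset (assumption:
# values are double-quoted or bare; brace-delimited values are not handled).
TOKEN_SPEC = [
    ("AT", r"@"),
    ("LBRACE", r"\{"),
    ("RBRACE", r"\}"),
    ("COMMA", r","),
    ("EQUALS", r"="),
    ("STRING", r'"[^"]*"'),
    ("NAME", r"[A-Za-z0-9_:.\-]+"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Convert a sequence of characters into a list of (kind, value) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling tokenize('@book{knuth, year = "1984"}') gives a flat list of tokens
starting with ("AT", "@") and ("NAME", "book"). The list carries no structure
yet; working out that "year" is a field of the entry "knuth" is the parser's
job.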
The standard combo to develop parsers consists of relying on lex/flex to
generate a lexer and yacc/bison to generate a parser that uses flex's lexer.
There are plenty of resources dedicated to this specific
topic, both on the web and in good old dead tree format.
In spite of that, I believe that flex and bison do more harm than good,
particularly when developing parsers for languages which are relatively
simple. It takes a bit of time to learn how to use them properly, and they
can also cause a number of problems that may not be trivial to fix. As an
alternative, you can always write the parser yourself, without any fancy
tool or code generator. There are some disadvantages to that: it takes a
bit of work to write and debug, and you may end up with a component which
is harder to maintain. Yet, the advantages often outweigh the
disadvantages: you have much more control over what code goes into your
parser, your parser tends to be considerably more efficient, you don't have
to fiddle with your build system to support flex and bison, and once you
know the technique you will be able to develop parsers in languages other
than C or C++.
If you decide to write your own bibtex parser by hand as a learning
experience, then you can't go wrong by looking up LL parsers.
Wikipedia has an article on that topic which you might find interesting.
http://en.wikipedia.org/wiki/LL_parser
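As a rough idea of what a hand-written LL(1) parser looks like, here is a
recursive descent sketch in Python. It is not a full bibtex parser: the
grammar below is a simplified one I am assuming for illustration (values
are double-quoted strings or bare words; brace-delimited values, @string
and # concatenation are left out), and the class and function names are
made up.

```python
import re

# Assumed grammar (a simplified subset, not real BibTeX in full):
#   file  := entry*
#   entry := '@' NAME '{' NAME (',' field)* [','] '}'
#   field := NAME '=' value
#   value := STRING | NAME
TOKEN_RE = re.compile(
    r'\s*(?:(?P<STRING>"[^"]*")|(?P<NAME>[A-Za-z0-9_:.\-]+)|(?P<PUNCT>[@{},=]))'
)


def tokenize(text):
    """Return (kind, value) tokens; punctuation uses the character as its kind."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        kind = match.lastgroup
        value = match.group(kind)
        tokens.append((value if kind == "PUNCT" else kind, value))
        pos = match.end()
    tokens.append(("EOF", ""))
    return tokens


class Parser:
    """Recursive descent: one method per grammar rule, one token of lookahead."""

    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos][0]

    def expect(self, kind):
        found, value = self.tokens[self.pos]
        if found != kind:
            raise SyntaxError(f"expected {kind!r}, found {found!r}")
        self.pos += 1
        return value

    def parse_file(self):
        entries = []
        while self.peek() != "EOF":
            entries.append(self.parse_entry())
        return entries

    def parse_entry(self):
        self.expect("@")
        entry_type = self.expect("NAME").lower()
        self.expect("{")
        key = self.expect("NAME")
        fields = {}
        while self.peek() == ",":
            self.expect(",")
            if self.peek() == "}":  # tolerate a trailing comma
                break
            name, value = self.parse_field()
            fields[name] = value
        self.expect("}")
        return {"type": entry_type, "key": key, "fields": fields}

    def parse_field(self):
        name = self.expect("NAME").lower()
        self.expect("=")
        return name, self.parse_value()

    def parse_value(self):
        if self.peek() == "STRING":
            return self.expect("STRING")[1:-1]  # strip the surrounding quotes
        return self.expect("NAME")
```

Usage would be something like Parser(text).parse_file(), which returns a
list of dictionaries, one per entry. Each parse_* method decides what to do
by looking only at the next token, which is what makes this LL(1).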
Hope this helps,
Rui Maciel