Message from discussion parsing a bibtex file

Received: by 10.180.107.167 with SMTP id hd7mr1025230wib.0.1348449371600;
        Sun, 23 Sep 2012 18:16:11 -0700 (PDT)
X-FeedAbuse: http://nntpfeed.proxad.net/abuse.pl feeded by 88.191.116.97
Path: q11ni36707286wiw.1!nntp.google.com!feeder1-2.proxad.net!proxad.net!feeder2-2.proxad.net!nntpfeed.proxad.net!dedibox.gegeweb.org!gegeweb.eu!usenet.pasdenom.info!aioe.org!.POSTED!not-for-mail
From: Rui Maciel <rui.mac...@gmail.com>
Newsgroups: comp.unix.programmer
Subject: Re: parsing a bibtex file
Date: Mon, 24 Sep 2012 02:16:10 +0100
Organization: Aioe.org NNTP Server
Lines: 38
Message-ID: <k3oc8o$3j6$1@speranza.aioe.org>
References: <1348351577.3333.14.camel@roddur> <k3mebt$6of$1@news.albasani.net> <dee27232-7de1-4c8d-b74e-9f88d74e8e5d@googlegroups.com>
Reply-To: rui.mac...@gmail.com
NNTP-Posting-Host: kOOJJMfATi2f51TTu1lC7Q.user.speranza.aioe.org
Mime-Version: 1.0
X-Complaints-To: abuse@aioe.org
User-Agent: KNode/4.8.5
X-Notice: Filtered by postfilter v. 0.8.2
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 7Bit

Rudra Banerjee wrote:

> Is it possible to achieve my goal using flex?

Flex generates lexers, which are routines that convert sequences of 
characters into sequences of tokens.  If you wish to write a parser, then 
getting hold of a sequence of tokens is half the battle; the other half is 
making sense of that sequence of tokens.  For that you need another 
routine, called a parser, which builds upon the lexer.
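
To make the lexer half concrete, here is a minimal hand-written sketch in 
Python.  The token names (AT, LBRACE, WORD, STRING and so on) and the 
tokenize function are my own invention for illustration, and the input 
handled is only a small bibtex-like subset: no brace-delimited values, no 
comments, no escapes inside quoted strings.

```python
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str  # hypothetical token names: AT, LBRACE, RBRACE, COMMA, EQUALS, WORD, STRING
    text: str


# Single-character tokens of the simplified, bibtex-like input (an assumption).
PUNCTUATION = {"@": "AT", "{": "LBRACE", "}": "RBRACE", ",": "COMMA", "=": "EQUALS"}


def tokenize(source: str) -> Iterator[Token]:
    """Convert a sequence of characters into a sequence of tokens."""
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            # Whitespace only separates tokens; it produces none.
            i += 1
        elif ch in PUNCTUATION:
            yield Token(PUNCTUATION[ch], ch)
            i += 1
        elif ch == '"':
            # Quoted string: everything up to the next double quote.
            end = source.find('"', i + 1)
            if end == -1:
                raise ValueError(f"unterminated string starting at offset {i}")
            yield Token("STRING", source[i + 1:end])
            i = end + 1
        else:
            # Bare word: runs until whitespace, punctuation or a quote.
            start = i
            while (i < len(source) and not source[i].isspace()
                   and source[i] not in PUNCTUATION and source[i] != '"'):
                i += 1
            yield Token("WORD", source[start:i])
```

Calling list(tokenize('@book{knuth, year = "1984"}')) yields a flat list of 
tokens.  Note that the lexer knows nothing about which token orders are 
valid; that is the parser's job.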

The standard combo to develop parsers consists of relying on lex/flex to 
generate a lexer and yacc/bison to generate a parser that uses that lexer.  
There are plenty of resources dedicated to this specific topic, both on 
the web and in good old dead tree format.

In spite of that, I believe that flex/bison does more harm than good, 
particularly when developing parsers for languages which are relatively 
simple.  The tools take a bit of time to learn to use properly, and they 
can also cause a number of problems that may not be trivial to fix.  As an 
alternative, you can always write the parser yourself, without any fancy 
tool or code generator.  There are some disadvantages to that, such as 
taking a bit of work to write and debug, and ending up with a component 
which is harder to maintain.  Yet, the advantages often outweigh the 
disadvantages: you have much more control over what code goes into your 
parser, your parser tends to be considerably more efficient, you don't have 
to fiddle with your build system to support flex and bison, and once you 
know the technique you will be able to develop parsers in any language, 
not just C or C++.

If you decide to write your own bibtex parser by hand as a learning 
experience, then you can't go wrong by looking up LL parsers. 
Wikipedia has an article on that topic which you might find interesting.

http://en.wikipedia.org/wiki/LL_parser
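
To give an idea of what a hand-written LL parser looks like, here is a rough 
recursive-descent sketch in Python.  It is not a real bibtex parser: the 
grammar in the comments is a simplified subset I made up for illustration 
(quoted or bare values only, a single entry, no brace-delimited values, no 
@string or @comment), and the names Parser, tokenize and parse_entry are 
hypothetical.  Each grammar rule becomes one method, and one token of 
lookahead (peek) is enough to decide what to do next.

```python
import re
from typing import Dict, List, Tuple

# Tiny regex tokenizer for the simplified subset (an assumption, not real bibtex).
TOKEN_RE = re.compile(
    r'\s*(?:(?P<STRING>"[^"]*")|(?P<WORD>[^\s@{},="]+)|(?P<PUNCT>[@{},=]))')


def tokenize(source: str) -> List[Tuple[str, str]]:
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        kind = match.lastgroup
        text = match.group(kind)
        if kind == "STRING":
            text = text[1:-1]   # drop the surrounding quotes
        elif kind == "PUNCT":
            kind = text         # punctuation is its own token kind
        tokens.append((kind, text))
        pos = match.end()
    tokens.append(("EOF", ""))
    return tokens


class Parser:
    def __init__(self, tokens: List[Tuple[str, str]]) -> None:
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> str:
        """Kind of the next token: the single token of lookahead."""
        return self.tokens[self.pos][0]

    def expect(self, kind: str) -> str:
        actual, text = self.tokens[self.pos]
        if actual != kind:
            raise SyntaxError(f"expected {kind}, found {actual}")
        self.pos += 1
        return text

    def entry(self) -> Dict[str, object]:
        # entry -> '@' WORD '{' WORD fields '}'
        self.expect("@")
        entry_type = self.expect("WORD")
        self.expect("{")
        key = self.expect("WORD")
        fields = self.fields()
        self.expect("}")
        return {"type": entry_type, "key": key, "fields": fields}

    def fields(self) -> Dict[str, str]:
        # fields -> (',' field)* [',']
        result: Dict[str, str] = {}
        while self.peek() == ",":
            self.expect(",")
            if self.peek() == "}":  # trailing comma before the closing brace
                break
            name, value = self.field()
            result[name] = value
        return result

    def field(self) -> Tuple[str, str]:
        # field -> WORD '=' (STRING | WORD)
        name = self.expect("WORD")
        self.expect("=")
        if self.peek() == "STRING":
            return name, self.expect("STRING")
        return name, self.expect("WORD")


def parse_entry(source: str) -> Dict[str, object]:
    parser = Parser(tokenize(source))
    result = parser.entry()
    parser.expect("EOF")  # reject trailing garbage after the entry
    return result
```

For instance, parse_entry('@book{knuth84, title = "The TeXbook", year = 1984}') 
returns a dictionary with the entry type, the citation key and the fields.  
Extending this to the full bibtex syntax mostly means adding rules, and 
therefore methods, for the constructs left out above.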


Hope this helps,
Rui Maciel