[Boost-users] Comparison boost spirit and ANTLR


Olivier Austina

May 27, 2013, 2:06:19 PM
to boost...@lists.boost.org

Hi,

I would like to choose a parser.

I would like to use a parser for text processing (natural language text). Which parser is suited to this case? In general, what are the benefits of using Boost.Spirit instead of ANTLR, or ANTLR instead of Boost.Spirit? Thank you.

--
Regards
Austina

Igor R

May 27, 2013, 3:14:09 PM
to boost...@lists.boost.org
I guess you'd get more answers on Spirit General ML:
https://lists.sourceforge.net/lists/listinfo/spirit-general
_______________________________________________
Boost-users mailing list
Boost...@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users

Olivier Austina

May 27, 2013, 3:55:12 PM
to boost...@lists.boost.org
Thank you, I have forwarded the question to the Spirit ML.

Regards
Austina

Mathias Gaunard

May 29, 2013, 8:32:08 AM
to boost...@lists.boost.org
ANTLR is LR, Spirit is LL.
Spirit is embedded in C++, ANTLR is a separate preprocessor.

Spirit is slow to compile and isn't very efficient at runtime, but it's
fairly nice to use once you get used to it.

Bjorn Reese

May 29, 2013, 8:48:50 AM
to boost...@lists.boost.org
On 05/27/2013 08:06 PM, Olivier Austina wrote:

> I would like to use a parser for text processing (natural language
> text). Which parser is suited to this case? In general, what are the
> benefits of using Boost.Spirit instead of ANTLR, or ANTLR instead of
> Boost.Spirit? Thank you.

If the text is really a natural language, then you may have difficulty
specifying a syntax. In this case it is more common to resort to machine
learning for natural language processing.

Igor R

May 29, 2013, 8:54:21 AM
to boost...@lists.boost.org
> ANTLR is LR, Spirit is LL.
> Spirit is embedded in C++, ANTLR is a separate preprocessor.
>
> Spirit is slow to compile and isn't very efficient at runtime, but it's fairly nice to use once you get used to it.

As for runtime performance, it depends on the use-case.
http://alexott.blogspot.co.il/2010/01/boostspirit2-vs-atoi.html

salvatore dario minonne

May 29, 2013, 8:59:34 AM
to boost...@lists.boost.org
ANTLR is LL as well
--
SDM

Larry Evans

May 29, 2013, 10:51:01 AM
to boost...@lists.boost.org
On 05/29/13 07:59, salvatore dario minonne wrote:
> ANTLR is LL as well
More specifically, ANTLR is LL(k) for some k>0:

http://en.wikipedia.org/wiki/LL_parser

Spirit is a PEG:


http://www.boost.org/doc/libs/1_43_0/libs/spirit/doc/html/spirit/abstracts/parsing_expression_grammar.html

Both PEG and LL(k) parsers are recursive descent; however, LL(k) means
the parser can look ahead k tokens to decide which alternative
to parse. I think Spirit's expect operator means it can look ahead
1 token. Also, Spirit will try alternatives in order until it
finds a match, whereas an LL(k) parser will not try to parse each
alternative, because it looks ahead k tokens to decide which alternative
is the only one possible (as described on the LL_parser page linked
above).
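
To make the difference concrete, here is a minimal sketch in plain C++
(no Spirit, no ANTLR) that treats characters as tokens and literal
strings as alternatives. The names orderedChoice and predictAlternative
are made up for this illustration and are not part of either library.
The first function behaves like a PEG prioritized choice; the second
picks an alternative from one token of lookahead, as an LL(1) parser
would, and assumes the alternatives start with distinct characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// PEG-style prioritized choice over literal alternatives (hypothetical helper).
// Tries each alternative in order at `pos`; the first match wins and later
// alternatives are never examined. Returns the matched length, or -1.
int orderedChoice(const std::string& input, std::size_t pos,
                  const std::vector<std::string>& alternatives) {
    assert(pos <= input.size());
    for (const std::string& alt : alternatives) {
        if (input.compare(pos, alt.size(), alt) == 0) {
            return static_cast<int>(alt.size());
        }
        // No match: fall through and retry the next alternative from `pos`.
    }
    return -1;
}

// LL(1)-style predictive choice (hypothetical helper): look at one character
// of lookahead and commit to the alternative that starts with it, without
// any trial parsing. Assumes the alternatives are non-empty and start with
// distinct characters. Returns the index of the predicted alternative, or -1.
int predictAlternative(const std::string& input, std::size_t pos,
                       const std::vector<std::string>& alternatives) {
    if (pos >= input.size()) return -1;
    for (std::size_t i = 0; i < alternatives.size(); ++i) {
        if (!alternatives[i].empty() && alternatives[i][0] == input[pos]) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

With the input "ab", orderedChoice over {"a", "ab"} returns 1 because
"a" matches first and "ab" is never tried; with the order swapped to
{"ab", "a"} it returns 2. Order matters for a PEG in a way it does not
for a predictive parser.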

Also, w.r.t. natural language parsing, Wikipedia says that's
probably not a good idea (as mentioned in the
parsing_expression_grammar.html page).

HTH.


Michael Caisse

May 29, 2013, 11:48:43 AM
to boost...@lists.boost.org
On 05/29/2013 05:32 AM, Mathias Gaunard wrote:
>
> Spirit is slow to compile and isn't very efficient at runtime, but it's
> fairly nice to use once you get used to it.

Mathias,

"isn't very efficient at runtime" ... are you concerned about the
processing speed, or something else?


michael

--
Michael Caisse
ciere consulting
ciere.com

Evan Driscoll

May 29, 2013, 12:05:40 PM
to boost...@lists.boost.org, Larry Evans
On 05/29/2013 09:51 AM, Larry Evans wrote:
> On 05/29/13 07:59, salvatore dario minonne wrote:
>> ANTLR is LL as well
> More specifically, ANTLR is LL(k) for some k>0:
>
> http://en.wikipedia.org/wiki/LL_parser

Actually *that's* not true either. ANTLR generates LL(*) parsers. With
some restrictions, the generated parsers have infinite lookahead.

For instance,

    nonterm1 : (term1)* term2
             | (term1)* term3

is not LL(k) for any k: if you give me a k, I give you the string
"term1^k term2", and you can't decide between those alternatives. That
grammar is, however, LL(*).

I'm not sure how much this actually affects real usage, just trying to
be accurate. There is at least one example of something you may want to
do which is LL(*) but not LL(k):

    classMember: modifier* type ident SEMI
               | modifier* type ident LPAREN arglist RPAREN body

    modifier: PUBLIC | PRIVATE | PROTECTED | STATIC | ...


(The description of what makes a grammar LL(*) requires more than I feel
like explaining now.)
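
As a rough illustration of the classMember case, the following plain
C++ sketch predicts the alternative by scanning ahead over a token
vector. It is hand-written for this one rule and is not what ANTLR
generates; the Tok and Member enums, predictMember, and kUnbounded are
all names invented here. A finite maxLookahead stands in for an LL(k)
decision, and kUnbounded stands in for LL(*).

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical token and result kinds for the classMember rule.
enum class Tok { Modifier, Type, Ident, Semi, LParen };
enum class Member { Field, Method, Undecided };

// Sentinel meaning "no cap on lookahead", i.e. an LL(*)-style decision.
const std::size_t kUnbounded = std::numeric_limits<std::size_t>::max();

// Predict which classMember alternative applies without consuming input.
// At most `maxLookahead` tokens may be inspected: a finite cap models LL(k),
// kUnbounded models LL(*).
Member predictMember(const std::vector<Tok>& tokens, std::size_t maxLookahead) {
    assert(maxLookahead > 0);  // a decision needs at least one token
    std::size_t i = 0;
    // Scan past modifier*; this is the part no fixed k can bound.
    while (i < tokens.size() && i < maxLookahead && tokens[i] == Tok::Modifier) {
        ++i;
    }
    // The deciding token sits at index i + 2, right after `type ident`.
    const std::size_t deciding = i + 2;
    if (deciding >= tokens.size() || deciding >= maxLookahead) {
        return Member::Undecided;  // ran out of input or of allowed lookahead
    }
    if (tokens[i] != Tok::Type || tokens[i + 1] != Tok::Ident) {
        return Member::Undecided;  // neither alternative can match
    }
    if (tokens[deciding] == Tok::Semi) return Member::Field;
    if (tokens[deciding] == Tok::LParen) return Member::Method;
    return Member::Undecided;
}
```

For the tokens "modifier modifier type ident LPAREN", a cap of 3 leaves
the decision open, while a cap of 5 or kUnbounded predicts the method
alternative. Adding more modifiers defeats any fixed cap, which is the
point of the example above.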

Evan

salvatore dario minonne

May 29, 2013, 12:18:38 PM
to boost...@lists.boost.org
I'd like to apologize for my previous email, which was somewhat laconic.
My goal was to stress the fact that ANTLR does not generate LR parsers.

To add something to the original question: TTBOMK the latest version of ANTLR (v4) doesn't generate C/C++ code, and v3 generates only C.

Hope this is useful for Olivier.

--
SDM

Larry Evans

May 29, 2013, 12:31:54 PM
to boost...@lists.boost.org
On 05/29/13 09:51, Larry Evans wrote:
> On 05/29/13 07:59, salvatore dario minonne wrote:
>> ANTLR is LL as well
> More specifically, ANTLR is LL(k) for some k>0:
>
> http://en.wikipedia.org/wiki/LL_parser
>
> Spirit is a PEG:
>
>
> http://www.boost.org/doc/libs/1_43_0/libs/spirit/doc/html/spirit/abstracts/parsing_expression_grammar.html
>
> Both PEG and LL(k) parsers are recursive descent; however, LL(k) means
> the parser can look ahead k tokens to decide which alternative
> to parse. I think Spirit's expect operator means it can look ahead
> 1 token. Also, Spirit will try alternatives in order until it
> finds a match;
The parsing_expression_grammar.html page contains a link to:

http://pdos.csail.mit.edu/~baford/packrat/popl04/

which contains a link to a .pdf and a ps.gz file. My Adobe reader
could not read the .pdf file; however, my gv could read the unzipped
ps.gz file, whose page 1 supported the description given above for
how spirit handles the alternative operator, '|', which, on page 1 of
the ps file is called the "prioritized choice operator, '/'".
[snip]

HTH.

-Larry

Mathias Gaunard

May 30, 2013, 8:25:53 AM
to boost...@lists.boost.org
On 29/05/13 14:59, salvatore dario minonne wrote:
> ANTLR is LL as well
>

Sorry for the huge mistake, I must have confused it with other
preprocessors that are usually LR.

Mathias Gaunard

May 30, 2013, 8:27:36 AM
to boost...@lists.boost.org
On 29/05/13 17:48, Michael Caisse wrote:
> On 05/29/2013 05:32 AM, Mathias Gaunard wrote:
>>
>> Spirit is slow to compile and isn't very efficient at runtime, but it's
>> fairly nice to use once you get used to it.
>
> Mathias,
>
> "isn't very efficient at runtime" ... are you concerned about the
> processing speed, or something else?

I'm concerned by the massive copying of attributes going on.
Building a deep AST will result in a huge number of copies.
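
The cost being described can be modelled outside Spirit with a small
plain C++ sketch. It does not measure Spirit itself; it assumes, purely
for illustration, that each nested rule hands its attribute to its
parent once, either by copying or by moving. Leaf, makeLeaves,
copiesWhenCopying, and copiesWhenMoving are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical AST leaf that counts how often it is copied.
struct Leaf {
    static int copies;
    int value;
    explicit Leaf(int v) : value(v) {}
    Leaf(const Leaf& other) : value(other.value) { ++copies; }
    Leaf(Leaf&& other) noexcept : value(other.value) {}
    Leaf& operator=(const Leaf& other) { value = other.value; ++copies; return *this; }
    Leaf& operator=(Leaf&& other) noexcept { value = other.value; return *this; }
};
int Leaf::copies = 0;

// Build `count` leaves in place, without copying any of them.
std::vector<Leaf> makeLeaves(std::size_t count) {
    std::vector<Leaf> leaves;
    leaves.reserve(count);
    for (std::size_t i = 0; i < count; ++i) leaves.emplace_back(static_cast<int>(i));
    return leaves;
}

// Pass the attribute up through `depth` nested rules, copying at each level.
int copiesWhenCopying(std::size_t leafCount, std::size_t depth) {
    std::vector<Leaf> attr = makeLeaves(leafCount);
    Leaf::copies = 0;
    for (std::size_t level = 0; level < depth; ++level) {
        std::vector<Leaf> parent(attr);  // copies every leaf
        attr.swap(parent);               // swaps buffers, no element copies
    }
    assert(attr.size() == leafCount);
    return Leaf::copies;
}

// Same propagation, but each level moves the attribute into its parent.
int copiesWhenMoving(std::size_t leafCount, std::size_t depth) {
    std::vector<Leaf> attr = makeLeaves(leafCount);
    Leaf::copies = 0;
    for (std::size_t level = 0; level < depth; ++level) {
        std::vector<Leaf> parent(std::move(attr));  // steals the buffer
        attr = std::move(parent);
    }
    assert(attr.size() == leafCount);
    return Leaf::copies;
}
```

In this model the copying version performs leafCount * depth leaf
copies, so the cost grows with the depth of the tree, while the moving
version performs none.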

Olivier Austina

Jun 1, 2013, 10:49:36 AM
to boost...@lists.boost.org
Hi All,

Thank you very much for the edifying replies.

Best regards
Olivier



Joel de Guzman

Jun 17, 2013, 3:13:38 AM
to boost...@lists.boost.org
On 5/30/13 8:27 PM, Mathias Gaunard wrote:
> On 29/05/13 17:48, Michael Caisse wrote:
>> On 05/29/2013 05:32 AM, Mathias Gaunard wrote:
>>>
>>> Spirit is slow to compile and isn't very efficient at runtime, but it's
>>> fairly nice to use once you get used to it.
>>
>> Mathias,
>>
>> "isn't very efficient at runtime" ... are you concerned about the
>> processing speed, or something else?
>
> I'm concerned by the massive copying of attributes going on.
> Building a deep AST will result in a huge number of copies.


FWIW, Spirit is faster than ANTLR. People report Spirit to be 2.5 to 7 times
faster than ANTLR.

http://article.gmane.org/gmane.comp.parsers.spirit.general/16459/match=boost+spirit+antlr

http://article.gmane.org/gmane.comp.parsers.spirit.general/16463/match=spirit+antlr


Regards,
--
Joel de Guzman
http://www.ciere.com
http://boost-spirit.com
http://www.cycfi.com/