[Haskell-cafe] C++ Parser?


Christopher Brown

24 Jan 2012, 05:06:00
to haskel...@haskell.org
Hi,

I have stumbled across language-c on hackage and I was wondering if anyone is aware if there exists a full C++ parser written in Haskell?

Many thanks,
Chris.

_______________________________________________
Haskell-Cafe mailing list
Haskel...@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Antoine Latter

24 Jan 2012, 08:30:32
to Christopher Brown, haskel...@haskell.org
On Tue, Jan 24, 2012 at 4:06 AM, Christopher Brown
<cm...@st-andrews.ac.uk> wrote:
> Hi,
>
> I have stumbled across language-c on hackage and I was wondering if anyone is aware if there exists a full C++ parser written in Haskell?
>

I'm not aware of one.

When it comes to parsing C++, I've always been a fan of this essay:
http://www.nobugs.org/developer/parsingcpp/

It's a hobbyist's tale of looking into parsing C++ and then an
explanation of why he gave up. It's older, so perhaps the
state-of-the-art has advanced since then.

Antoine

Hans Aberg

24 Jan 2012, 08:54:42
to Christopher Brown, haskel...@haskell.org
On 24 Jan 2012, at 11:06, Christopher Brown wrote:

> I have stumbled across language-c on hackage and I was wondering if anyone is aware if there exists a full C++ parser written in Haskell?

There is a yaccable grammar
http://www.parashift.com/c++-faq-lite/compiler-dependencies.html#faq-38.11

You might run it through a parser generator that outputs Haskell code.
http://www.haskell.org/haskellwiki/Applications_and_libraries/Compiler_tools

Hans

Jason Dagit

24 Jan 2012, 09:40:13
to Christopher Brown, haskel...@haskell.org
On Tue, Jan 24, 2012 at 2:06 AM, Christopher Brown
<cm...@st-andrews.ac.uk> wrote:
> Hi,
>
> I have stumbled across language-c on hackage and I was wondering if anyone is aware if there exists a full C++ parser written in Haskell?

I don't think one exists. I've heard it's quite difficult to get
template parsing working in an efficient manner.

My understanding is that "real" C++ compilers use the Edison Design
Group's parser: http://www.edg.com/index.php?location=c_frontend

For example, the Intel C++ compiler uses the edg front-end:
http://en.wikipedia.org/wiki/Intel_C%2B%2B_Compiler

I thought even Microsoft's compiler (which is surprisingly C++
compliant) uses it, but I can't find details on Google about that.

There is at least one open source project using it, rose, so it's not
unthinkable to use it from Haskell: http://rosecompiler.org/

Rose has had working Haskell bindings in the past, but they have
bit-rotted somewhat. With rose you get support for much more than parsing
C++. You also get C and Fortran parsers as well as a fair bit of
static analysis. The downside is that rose is a big pile of C++
itself and is hard to compile on some platforms.

If you made a BSD3 licensed, fully functional, efficient C++ parser
that would be great. If you made it so that it preserves comments and
the input well enough to do source to source transformations
(unparsing) that would be very useful. I often wish I had rose
implemented in Haskell instead of C++.

Jason

Christopher Brown

24 Jan 2012, 09:54:17
to Jason Dagit, haskel...@haskell.org
Hi Everyone,

Thanks for everyone's kind responses: very helpful so far!

I fully appreciate and understand how difficult writing a C++ parser is. However I may need one for our new Paraphrase project, where I may be targeting C++ for writing a refactoring tool. Obviously I don't want to start writing one myself, hence I was asking if anyone knew about an already existing implementation.

Rose looks interesting, I'll check that out, thanks!

Chris.

Jason Dagit

24 Jan 2012, 10:16:37
to Christopher Brown, haskel...@haskell.org
On Tue, Jan 24, 2012 at 6:54 AM, Christopher Brown
<cm...@st-andrews.ac.uk> wrote:
> Hi Everyone,
>
> Thanks for everyone's kind responses: very helpful so far!
>
> I fully appreciate and understand how difficult writing a C++ parser is. However I may need one for our new Paraphrase project, where I may be targeting C++ for writing a refactoring tool. Obviously I don't want to start writing one myself, hence I was asking if anyone knew about an already existing implementation.
>
> Rose looks interesting, I'll check that out, thanks!

I did some more digging after sending my email. I didn't learn about
GLR parsers when I was in school, but that seems to be what the cool
compilers use these days. Then I discovered that Happy supports GLR;
that is happy!

Next I found that GLR supposedly makes C++ parsing much easier than
LALR. As the author of Elkhound puts it: "The reason I wrote Elkhound
is to be able to write a C++ parser. The parser is called Elsa, and is
included in the distribution below." The Elsa documentation should give
you a flavor for what needs to be done when making sense of C++:
http://scottmcpeak.com/elkhound/sources/elsa/index.html

NB: I don't think it's been seriously worked on since 2005 so I assume
it doesn't match the latest C++ spec.

The grammar that Elsa parses is here. One warning: it doesn't
reject all invalid programs (i.e., it errs on the side of accepting too
much): http://scottmcpeak.com/elkhound/sources/elsa/cc.gr

I think the path of least resistance is pure rose without the Haskell
support. Having said that, I think the most fun direction would be
converting the elsa grammar to happy. It's just that you'll have a
lot of work (read: testing, debugging, performance tuning, and then
adding vendor features) to do. One side benefit is that you'll know
much more about the intricacies of C++ when you're done than if you
use someone else's parser.
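
To make the GLR point above concrete: the classic C++ ambiguity is a statement such as `a * b;`, which declares a pointer `b` if `a` names a type and is a multiplication otherwise. The sketch below is not GLR and says nothing about how Elkhound or Happy work internally. It swaps in a much simpler list-of-successes parser that, like a GLR parser, can return every parse, and then filters the results with a list of known type names. `Stmt`, `allParses` and `resolve` are hypothetical names chosen for illustration.

```haskell
-- Sketch only: a list-of-successes parser (not GLR) that keeps every
-- parse of an ambiguous statement. All names here are hypothetical.
import Data.Char (isAlpha, isAlphaNum, isSpace)

-- Just enough AST for the two readings of `a * b;`.
data Stmt
  = Decl String String   -- pointer declaration: type name, variable name
  | Mul  String String   -- expression statement: left operand, right operand
  deriving (Eq, Show)

-- A parser returns every possible (result, remaining input) pair.
newtype Parser a = Parser { runParser :: String -> [(a, String)] }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> [ (f a, rest) | (a, rest) <- p s ]

instance Applicative Parser where
  pure a = Parser $ \s -> [(a, s)]
  Parser pf <*> Parser pa =
    Parser $ \s -> [ (f a, s2) | (f, s1) <- pf s, (a, s2) <- pa s1 ]

-- Unbiased choice: keep the results of both alternatives.
orElse :: Parser a -> Parser a -> Parser a
orElse (Parser p) (Parser q) = Parser $ \s -> p s ++ q s

-- An identifier, skipping leading whitespace.
identifier :: Parser String
identifier = Parser $ \s ->
  case span identChar (dropWhile isSpace s) of
    (name@(c:_), rest) | isAlpha c || c == '_' -> [(name, rest)]
    _                                          -> []
  where identChar c = isAlphaNum c || c == '_'

-- A single punctuation character, skipping leading whitespace.
symbol :: Char -> Parser ()
symbol c = Parser $ \s ->
  case dropWhile isSpace s of
    (x:rest) | x == c -> [((), rest)]
    _                 -> []

-- Both productions match the same token sequence, so both succeed.
stmt :: Parser Stmt
stmt = decl `orElse` mul
  where
    decl = Decl <$> identifier <* symbol '*' <*> identifier <* symbol ';'
    mul  = Mul  <$> identifier <* symbol '*' <*> identifier <* symbol ';'

-- Every parse that consumes the whole input.
allParses :: String -> [Stmt]
allParses src = [ r | (r, rest) <- runParser stmt src, all isSpace rest ]

-- Later pass: use the known type names to discard impossible readings.
resolve :: [String] -> [Stmt] -> [Stmt]
resolve typeNames = filter ok
  where
    ok (Decl t _) = t `elem` typeNames
    ok (Mul  l _) = l `notElem` typeNames
```

In GHCi, `allParses "a * b;"` returns both readings, `resolve ["a"]` keeps only the declaration, and `resolve []` keeps only the multiplication. A real C++ front end has to do this with scoping, templates and much more, which is where the testing and debugging effort mentioned above goes.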

Christopher Brown

24 Jan 2012, 11:40:50
to Jason Dagit, haskel...@haskell.org
Hi Jason,

Thanks very much for your thoughtful response.

I am intrigued about the Happy route: as I have never really used Happy before, am I right in thinking I could take the .gr grammar, feed it into Happy to generate a parser, or a template for a parser, and then go from there?

Chris.

Jason Dagit

24 Jan 2012, 11:54:30
to Christopher Brown, haskel...@haskell.org
On Tue, Jan 24, 2012 at 8:40 AM, Christopher Brown
<cm...@st-andrews.ac.uk> wrote:
> Hi Jason,
>
> Thanks very much for your thoughtful response.
>
> I am intrigued about the Happy route: as I have never really used Happy before, am I right in thinking I could take the .gr grammar, feed it into Happy to generate a parser, or a template for a parser, and then go from there?

That's the basic idea although the details will be harder than that.
Happy is a parser generator (like Bison, Yacc, and ANTLR). Happy and
elsa will have very different syntax for their grammar definitions.
You could explore taking the elkhound source and instead of generating
C++ you could generate the input for happy, if that makes sense. A
translation by hand would probably be easiest.

I would highly recommend making a few toy parsers with Happy + Alex
(Alex is like lex or flex) to get a feel for it before trying to use
the grammar from Elsa.

A quick google search pointed me at these examples:
http://darcs.haskell.org/happy/examples/
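
For readers who have not used a lexer/parser-generator pair before, the division of labour can be sketched in plain Haskell. The code below is hand-written and is not Alex or Happy input or output: `tokenize` plays the role a generated lexer would play and `expr` the role of a generated parser, for a toy arithmetic grammar. All names are hypothetical and chosen only for illustration.

```haskell
-- Sketch only: a hand-written lexer and recursive-descent parser for a
-- toy grammar, showing the two stages that a lexer generator and a
-- parser generator would each produce. All names are hypothetical.
import Data.Char (isDigit, isSpace)

-- Stage 1 (the lexer's job): characters to tokens.
data Token = TNum Int | TPlus | TTimes | TLParen | TRParen
  deriving (Eq, Show)

tokenize :: String -> Maybe [Token]
tokenize [] = Just []
tokenize (c:cs)
  | isSpace c = tokenize cs
  | isDigit c = let (ds, rest) = span isDigit (c:cs)
                in (TNum (read ds) :) <$> tokenize rest
  | c == '+'  = (TPlus :)   <$> tokenize cs
  | c == '*'  = (TTimes :)  <$> tokenize cs
  | c == '('  = (TLParen :) <$> tokenize cs
  | c == ')'  = (TRParen :) <$> tokenize cs
  | otherwise = Nothing      -- unknown character: lexing fails

-- Stage 2 (the parser's job): tokens to a syntax tree.
data Expr = Num Int | Add Expr Expr | Mul Expr Expr
  deriving (Eq, Show)

-- Grammar being parsed:
--   Expr   : Term   ('+' Term)*
--   Term   : Factor ('*' Factor)*
--   Factor : NUM | '(' Expr ')'
expr :: [Token] -> Maybe (Expr, [Token])
expr ts = do
  (t, rest) <- term ts
  exprTail t rest

-- Fold further '+' operands into a left-nested tree.
exprTail :: Expr -> [Token] -> Maybe (Expr, [Token])
exprTail lhs (TPlus : ts) = do
  (rhs, rest) <- term ts
  exprTail (Add lhs rhs) rest
exprTail lhs ts = Just (lhs, ts)

term :: [Token] -> Maybe (Expr, [Token])
term ts = do
  (f, rest) <- factor ts
  termTail f rest

-- Fold further '*' operands into a left-nested tree.
termTail :: Expr -> [Token] -> Maybe (Expr, [Token])
termTail lhs (TTimes : ts) = do
  (rhs, rest) <- factor ts
  termTail (Mul lhs rhs) rest
termTail lhs ts = Just (lhs, ts)

factor :: [Token] -> Maybe (Expr, [Token])
factor (TNum n : ts) = Just (Num n, ts)
factor (TLParen : ts) = do
  (e, rest) <- expr ts
  case rest of
    TRParen : rest' -> Just (e, rest')
    _               -> Nothing
factor _ = Nothing

-- Run both stages and require that all tokens are consumed.
parseExpr :: String -> Maybe Expr
parseExpr src = do
  toks <- tokenize src
  (e, rest) <- expr toks
  if null rest then Just e else Nothing
```

In GHCi, `parseExpr "1 + 2 * 3"` gives a tree with the multiplication nested under the addition, and malformed input such as `"1 +"` gives `Nothing`. With a parser generator you would write only the grammar and the semantic actions, and the equivalent of `expr`, `term` and `factor` would be generated for you.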

Nathan Howell

24 Jan 2012, 12:32:48
to Christopher Brown, haskel...@haskell.org
On Tue, Jan 24, 2012 at 2:06 AM, Christopher Brown <cm...@st-andrews.ac.uk> wrote:
> I have stumbled across language-c on hackage and I was wondering if anyone is aware if there exists a full C++ parser written in Haskell?


The clang API is in C++ and will do just about everything you'd ever want to do with C/ObjC/C++ source.

-n

Stephen Tetley

24 Jan 2012, 14:06:30
to Christopher Brown, haskel...@haskell.org
There is also DMS from Ira Baxter's company, Semantic Designs.
This is an industry-proven refactoring framework that handles C++ as
well as other languages.

I think the ANTLR C++ parser may have advanced since the article
Antoine Latter linked to, but personally I'd run a mile before trying to
do any source transformation of C++ even if someone were waving a very
large cheque at me.

On 24 January 2012 14:54, Christopher Brown <cm...@st-andrews.ac.uk> wrote:
> Hi Everyone,
>
> Thanks for everyone's kind responses: very helpful so far!
>
> I fully appreciate and understand how difficult writing a C++ parser is. However I may need one for our new Paraphrase project, where I may be targeting C++ for writing a refactoring tool.


David Laing

24 Jan 2012, 23:12:35
to Stephen Tetley, haskel...@haskell.org
Hi all,

Just to add to the list - Qt Creator contains a pretty nice (and incremental) C++ parser.

Cheers,

Dave

Yin Wang

1 Feb 2012, 15:42:33
to Christopher Brown, haskel...@haskell.org
I have written a C++ parser in Scheme, with a Parsec-style parser
combinator library. It can parse a large portion of C++ and I use it
to do structural comparison between ASTs. I made some macros so that
the parser combinators look like the grammar itself.

Its code is at:

http://github.com/yinwang0/ydiff/blob/master/parse-cpp.ss

A demo of the parse tree based comparison tool is at:

http://www.cs.indiana.edu/~yw21/demos/d8-3404-d8-8424.html


The bit of information I can tell you about parsing C++:

- C++'s grammar is not that bad if you see the consistency in it.
Parsing a major portion of C++ is not hard. I made the parser in two
days. It can parse most of Google's V8 JavaScript compiler code. I
just need to fix some corner cases later.

- It is better to delay semantic checks to a later stage. Don't put
those into the parser. Parse a larger language first, and then walk
the parse tree to eliminate semantically wrong programs.

- Don't try translating from the formal grammar or parser generator
files for C++. They contain years of bugs and patches and you will
probably be confused looking at them. I wrote the parser just by
looking at some example C++ programs.

Cheers,
    Yin
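
The second point above (parse a larger language first, then walk the tree to reject bad programs) can be sketched in a few lines of Haskell. This is not taken from the Scheme parser linked above; it is a deliberately tiny stand-in language with only `int x;` declarations and `x = y;` assignments, and `Stmt`, `parseProgram` and `undeclared` are hypothetical names. The parser accepts any names at all, and a separate pass reports the ones that were never declared.

```haskell
-- Sketch only: a permissive parser followed by a separate semantic pass.
-- The toy language and all names here are hypothetical.
import Data.Char (isSpace)

data Stmt
  = Declare String         -- int x;
  | Assign String String   -- x = y;
  deriving (Eq, Show)

-- Split a string on a separator character, keeping empty chunks.
splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn sep rest

-- Purely syntactic: does not ask whether any name has been declared.
parseStmt :: String -> Maybe Stmt
parseStmt s = case words s of
  ["int", name]   -> Just (Declare name)
  [lhs, "=", rhs] -> Just (Assign lhs rhs)
  _               -> Nothing

-- Parse every non-blank statement; fail if any one is malformed.
parseProgram :: String -> Maybe [Stmt]
parseProgram = traverse parseStmt . filter (not . all isSpace) . splitOn ';'

-- Later pass: walk the tree in order and collect uses of names that
-- have not been declared at that point.
undeclared :: [Stmt] -> [String]
undeclared = go []
  where
    go _ [] = []
    go env (Declare n : rest) = go (n : env) rest
    go env (Assign l r : rest) =
      [ v | v <- [l, r], v `notElem` env ] ++ go env rest
```

Here `parseProgram "int x; x = y;"` succeeds even though `y` is never declared, and `undeclared` applied to the result reports `y`. Keeping the two passes apart means the grammar stays small and the error for a semantically wrong program can be reported against a complete tree.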

Jason Dagit

1 Feb 2012, 16:07:08
to Yin Wang, haskel...@haskell.org
On Wed, Feb 1, 2012 at 12:42 PM, Yin Wang <yinw...@gmail.com> wrote:
> I have written a C++ parser in Scheme, with a Parsec-style parser
> combinator library. It can parse a large portion of C++ and I use it
> to do structural comparison between ASTs. I made some macros so that
> the parser combinators look like the grammar itself.
>
> Its code is at:
>
> http://github.com/yinwang0/ydiff/blob/master/parse-cpp.ss
>
> A demo of the parse tree based comparison tool is at:
>
> http://www.cs.indiana.edu/~yw21/demos/d8-3404-d8-8424.html
>
>
> The bit of information I can tell you about parsing C++:

Thank you for the interesting response and example code (that I
haven't had a chance to look at yet). How much support do you have
for templates?

Jason

Yin Wang

1 Feb 2012, 16:38:01
to Jason Dagit, haskel...@haskell.org
I haven't dealt explicitly with templates. I treat them as type
parameters (element $type-parameter). I don't check that they have
been declared at all. As explained, these are semantic checks and
should be deferred until the type checking stage ;-)


Cheers,
    Yin
