Status of C-runtime and majestic branch — parsing incomplete?


Martin Gercke

5:39 AM
to Grammatical Framework
Hi,
I am currently experimenting with GF-WordNet to set up an example which
- parses a German sentence,
- removes ambiguity by selecting the "correct" abstract syntax/meaning,
- translates the result into other languages.
For example, "Die Frau singt" should yield an abstract syntax tree such as `PredVPS (DetCN (DetQuant DefArt NumSg) (UseN woman_1_N)) (MkVPS (TTAnt TPres ASimul) PPos (UseV sing_2_V))`, with `woman_1_N` and `sing_2_V` picked as the intended WordNet senses, which then linearizes to English "the woman sings", Spanish "la mujer canta" and Italian "la donna canta".
For this I would love to use the C runtime: the Haskell runtime enumerates all trees, which even for simple sentences explode to more than 1,000 leaves. The C runtime promises bounded n-best parsing, which could help. I also wanted to see how I can narrow down the candidate trees, given that I know the "correct" abstract meaning of the nouns, verbs and adjectives involved.
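A minimal sketch of that sense-based narrowing, assuming the candidate trees are available as printed strings (for instance the `str()` of each parse result). The helper names `lexical_functions` and `filter_by_senses` are hypothetical and not part of any GF binding; the sketch simply keeps the trees that mention every required WordNet function.

```python
import re

# Matches abstract function names such as PredVPS, woman_1_N or sing_2_V.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_']*")


def lexical_functions(tree):
    """Return the set of identifiers that occur in a printed tree."""
    return set(_IDENT.findall(tree))


def filter_by_senses(trees, required):
    """Keep only the trees that contain every function name in `required`."""
    required = set(required)
    return [t for t in trees if required <= lexical_functions(t)]


# Two hand-written candidates that differ only in the chosen noun sense.
candidates = [
    "PredVPS (DetCN (DetQuant DefArt NumSg) (UseN woman_1_N)) "
    "(MkVPS (TTAnt TPres ASimul) PPos (UseV sing_2_V))",
    "PredVPS (DetCN (DetQuant DefArt NumSg) (UseN woman_2_N)) "
    "(MkVPS (TTAnt TPres ASimul) PPos (UseV sing_2_V))",
]
kept = filter_by_senses(candidates, ["woman_1_N", "sing_2_V"])
print(len(kept))
```

This only removes lexical ambiguity; trees that differ purely in syntactic structure would all survive the filter and still need ranking, for example by parse probability.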
So I went looking and found the majestic branch by looking at an issue (#130, "PGF as a database"), which looked very promising. I built the whole stack on WSL/Debian:
- C runtime (`src/runtime/c`, libpgf),
- Python binding (`src/runtime/python`),
- gf compiler (gf-4.0.0),
and downloaded the robust German grammar as an NGF.
`bootNGF`, `readNGF` and `lookupMorpho` work great and are *instant* (mmap), which is really nice.
The problem: parsing crashes. I un-commented `Concr.parse` in the binding, which wraps `pgf_parse`, to see if I could get it to run. It segfaults *inside the runtime*, even on the tiny `Food` example compiled by the majestic `gf` itself (so it's not a format mismatch). The crashes seem to be in the LR machinery: `PgfParser::shift` dereferencing an invalid `shift->seq`, and a division by zero in `Production::operator new` where `lin->res.size()==0`. `git log` on `parser.cxx` shows active work ("an experimental left-corner table maker", etc.), so it looks like the parser on `majestic` is still mid-rewrite or incomplete.
Could somebody give me some guidance?
1. What's the current status of the C runtime overall, and which branch is the "live" one?
2. Which C runtime runs on cloud.grammaticalframework.org/robust?
3. How does `majestic` relate to the other C-runtime branches? I see several (`pgf2-complete`, `lpgf`/`lpgf-memo`/`lpgf-string`, `concrete-new`, `compact-pgf`, `c-runtime`, ...)
4. Is there a branch/commit where end-to-end *parsing* with the NGF format actually works? If not, is it planned to get the parser running on the `majestic` branch at some point?
5. Any guidance on getting the C runtime + NGF running for parsing (not just lookup/linearization) would be hugely helpful.
Thanks a lot for any pointers! Martin

Krasimir Angelov

7:30 AM
to gf-...@googlegroups.com
Hi Martin,

The parser in the majestic branch still has bugs. I plan to get back to it at the end of the summer, when I will have fewer distractions from the many other things happening during the normal semester. The runtime that runs cloud.grammaticalframework.org/robust is the one from the majestic branch, but it is an older version which doesn't include the parser. The branches `pgf2-complete`, `lpgf`/`lpgf-memo`/`lpgf-string`, `concrete-new`, `compact-pgf`, `c-runtime` are other experimental versions which, as far as I know, are not used.

If you compile GF from the main branch and then compile the C runtime in src/runtime/c + the bindings in src/runtime/haskell-bind or src/runtime/python, then you can use the older C runtime which includes a parser but doesn't allow dynamic changes in the grammar. This also means that you have to compile GF WordNet with the compiler from the main branch.
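
As a rough sketch of what parsing and translating with that older runtime could look like from Python: the helper `translate` below is hypothetical, and it assumes the main-branch `pgf` binding exposes `pgf.readPGF`, a `languages` dictionary of concrete syntaxes, `parse(sentence, n=...)` yielding `(probability, tree)` pairs, and `linearize(tree)`. Please check these against the binding you actually built. The grammar file and language names in the comments are placeholders.

```python
def translate(grammar, source, sentence, n=5):
    """Parse `sentence` with the concrete syntax `source` and linearize
    each of the (at most n) trees into every other language.

    Returns a list of (probability, tree, {language: text}) tuples.
    """
    src = grammar.languages[source]
    results = []
    for prob, tree in src.parse(sentence, n=n):
        texts = {
            name: concr.linearize(tree)
            for name, concr in grammar.languages.items()
            if name != source
        }
        results.append((prob, tree, texts))
    return results


# Intended use with the real binding (not executed here; file and
# language names are placeholders):
#
#   import pgf
#   gr = pgf.readPGF("Parse.pgf")
#   for prob, tree, texts in translate(gr, "ParseGer", "die Frau singt"):
#       print(prob, tree, texts)
```

Because `translate` only relies on those four names, it can be exercised with a small stand-in object before the real grammar is compiled.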

Best,
Krasimir
