GF as a library - editing grammars in code


Daniel Dudink

May 20, 2021, 5:31:16 AM
to Grammatical Framework
Hi!

So I'm wanting to use GF in a bit of an unorthodox way. I don't know if it can even be done the way I envision it, so I figured instead of spending many more hours figuring out the GF source code I'd just ask here.

Basically, I would like a program that can, all programmatically:
1. Read a gfo file.
2. Make an edit (add a completely new word, change the word order in a particular construction, edit a word, etc.) without always having to recompile the entire grammar from scratch.
3. Use the edited grammar for translation and/or linearisation without first saving it to disk, while also being able to repeat step 2 at any time.
4. In the end, save it as a gfo file again, which can be fed back into step 1 at a later time.

I've considered just generating GF code as a string and compiling that, I've even done some test programs with that approach, but I think it would quickly grow very messy, and for large grammars it would probably take too long to compile the entire thing after each edit.

I've also looked at using the PGF API, but from what I can tell it can't do this, hence why I'm looking at the gfo format instead now. The pgf format and API will still be very useful for other parts of my project.

If this is doable, perhaps with a slightly different approach and/or an edit I can make within the GF source itself, I'd love some guidance in the right direction. I should add that I'm quite new to Haskell and GF, but this is a pet project and I'm willing to spend as much time as it takes.

Cheers,
Daniel

Krasimir Angelov

May 20, 2021, 6:03:58 AM
to gf-...@googlegroups.com
Hi Daniel,

I have a similar situation, but currently the way to go is to recompile the grammar. I do plan, however, to develop a new runtime which runs on top of a database. In other words, the .pgf file itself will become an on-disk database. There are two goals:

- to make it possible to recompile one rule at a time
- to make it possible to work with even bigger grammars.

Currently I can't add more languages on a server with a reasonable memory size, and recompiling from scratch takes more than 30 minutes. Moving to a database is unfortunately not an easy task. Does anyone want to join forces?

Best Regards,
Krasimir


John J. Camilleri

May 22, 2021, 7:21:29 PM
to gf-...@googlegroups.com
Hi Daniel,

What you need is to use the GF compiler library (PGF is the runtime library) here: https://hackage.haskell.org/package/gf-3.10/docs/GF.html
This is only available in Haskell.

John


John J. Camilleri

May 22, 2021, 7:21:34 PM
to gf-...@googlegroups.com
Whoops, sorry for the brief reply. To do everything you want to, you will really need to use both the GF and PGF libraries together, but you will not be able to avoid recompiling the PGF for every change in the grammar with the currently available libraries.

John

John J. Camilleri

May 22, 2021, 7:21:38 PM
to gf-...@googlegroups.com
I can only add that I share Krasimir's dream. This is a limitation of GF that we've come up against many times. There are some workarounds one can think of, such as writing grammars which produce placeholders and then performing placeholder-replacement outside of GF, but of course it's much better if this can be supported by GF natively. So while I share the vision, finding the resources to allocate to it is not easy.
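
To make the placeholder workaround concrete, here is a minimal sketch of the replacement step that would run outside GF. The placeholder shape (`XX` followed by a number) and the function `fill_placeholders` are assumptions for illustration; the thread does not say what the placeholders look like.

```python
import re
from typing import Dict

# Hypothetical placeholder tokens: XX1, XX2, ... emitted by the grammar as
# ordinary lexical items and swapped for real words after linearisation.
PLACEHOLDER = re.compile(r"\bXX(\d+)\b")


def fill_placeholders(linearisation: str, values: Dict[int, str]) -> str:
    """Replace each XXn token with values[n]; unknown tokens are left as-is."""
    def substitute(match: "re.Match[str]") -> str:
        index = int(match.group(1))
        return values.get(index, match.group(0))

    return PLACEHOLDER.sub(substitute, linearisation)


if __name__ == "__main__":
    print(fill_placeholders("the XX1 sees the XX2", {1: "dog", 2: "cat"}))
```

Because the substitution happens after GF has produced the string, this presumably only works where the inserted word needs no inflection or agreement from the grammar, which is part of why native support would be preferable.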

John

Daniel

May 23, 2021, 5:46:17 AM
to Grammatical Framework

Hi Krasimir and John, thank you very much for the replies,

I've just been considering how to proceed from there.

The database implementation seems very promising. I would love to help a little with that, in fact I fully intend to do more reading on how the PGF format is implemented, but I fear it might be a bit above my understanding at least for the time being. It also seems like it would be a multi-year job. I have some tentative plans to perhaps pursue a Master's in Gothenburg (in Language Technology) in a couple years, so if it's not done by then perhaps that would be an opportune moment for it.

That said, I would still like to somehow proceed. John's notion of switching around grammars is actually something I've considered, but when I first thought about it it seemed hard to work with. Now that I consider it again though...
I imagine the bulk of any large grammar would be the lexical entries. However, I don't know how much those actually add to the compilation time. If my naive understanding here is correct though, one could save a lot of compilation-time by making a grammar which duplicates all the syntax and morphology, but keeps only a bare minimum of words. Then the program would append to and recompile only that, merging it into the main grammar after editing finishes.

This seems doable and I think I'll try to go this route. I do have to ask if there's perhaps a more efficient way than just writing GF code through string manipulation and compiling the resulting file. If not, then that's what I'll do. I've done it that way before (with the GF compiler library, indeed) and it does work.

Daniel

John J. Camilleri

May 25, 2021, 3:19:40 PM
to gf-...@googlegroups.com
Hi Daniel,

> I imagine the bulk of any large grammar would be the lexical entries. However, I don't know how much those actually add to the compilation time. If my naive understanding here is correct though, one could save a lot of compilation-time by making a grammar which duplicates all the syntax and morphology, but keeps only a bare minimum of words. Then the program would append to and recompile only that, merging it into the main grammar after editing finishes.

So if I understand you correctly, you're talking here about splitting and merging on the PGF level. Merging PGFs is already done for combining different languages, but I don't know how easy it would be to split out only the lexicon from a PGF and combine it with another, ensuring there are no label clashes etc. I guess Krasimir will have a better idea, but since PGF is a compiled format my guess is it will be difficult.
 
> I do have to ask if there's perhaps a more efficient way than just writing GF code through string manipulation and compiling the resulting file. If not, then that's what I'll do. I've done it that way before (with the GF compiler library, indeed) and it does work.

It sounds like you want to compile GF without using the compiler... which is another way of saying you want to implement your own compiler :) Possibly for this limited case it might be more efficient computationally, but the human effort required to implement that would be significant. So I can't say this sounds like a good idea, unfortunately.

John
 

Daniel

May 25, 2021, 7:06:32 PM
to Grammatical Framework
Hi John,

The main thing I want is actually rather simple. It's just to have a program that can write and edit GF grammars.

Whether I am manipulating source code, PGF files or something in-between doesn't in itself matter to me, I'll go with whatever's the easiest and cleanest. The only catch is that it can't take too long to compile after each small edit, hence my initial question.

When the program is in active use and an edit to the grammar is made (say, the word order in a certain construction is changed), I want to have access, as soon as possible, to a compiled grammar which can translate and generate with the new changes. But it wouldn't need to have access to all lexical entries for this, only at least one of each base category (a noun for each noun gender, a verb for each conjugation class, etc.).
When editing finishes, the program is free to compile everything and take its time doing so.
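
A minimal sketch of how such a reduced lexicon could be selected, assuming the program already holds the full lexicon as a list of (function name, category label) pairs. That representation, the labels such as `N_masc`, and the function `minimal_lexicon` are all hypothetical.

```python
from typing import List, Tuple

# (GF function name, category label); the label is whatever distinguishes
# base categories in a given grammar, e.g. noun gender or verb class.
Entry = Tuple[str, str]


def minimal_lexicon(entries: List[Entry]) -> List[Entry]:
    """Keep the first entry seen for each category label, preserving order."""
    seen = set()
    kept: List[Entry] = []
    for name, category in entries:
        if category not in seen:
            seen.add(category)
            kept.append((name, category))
    return kept


if __name__ == "__main__":
    full = [
        ("man_N", "N_masc"),
        ("woman_N", "N_fem"),
        ("boy_N", "N_masc"),
        ("walk_V", "V_class1"),
        ("girl_N", "N_fem"),
    ]
    for entry in minimal_lexicon(full):
        print(entry)
```

The reduced list would feed the fast edit-time compile, while the full list is only compiled once editing finishes.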

I also wasn't talking about anything as sophisticated as splitting or merging PGFs, I could simply generate two source-codes (one minimal, one full), merge those when needed, then compile them separately. I'm open to suggestions though. But no I'm not looking to write my own compiler or optimise at all costs, it just needs to work and be usable (and preferably readable and maintainable) and I'll be happy :)

Hopefully that clears some things up,
Daniel

John J. Camilleri

May 28, 2021, 2:57:39 AM
to gf-...@googlegroups.com
Hi Daniel,

I think I understand your use case a bit better now. I think your approach makes sense, i.e. that you have a minimal lexicon which is used for compilation at "edit" time, and then the full compilation of the entire lexicon can happen "offline". This can be easily achieved by having your full lexicon in a separate module, and then just commenting out its import when in edit mode. Or you could have lexicon modules of the same name in different subfolders, e.g. mini/Lex.gf and full/Lex.gf, and then just use the --path flag when invoking GF to tell it where to look for the Lex module (also doable when invoking the GF compiler via the Haskell library). Of course, the compilation is not guaranteed to be fast (enough) even with a minimal lexicon, because rules which are slow to compile could come from anywhere, but this approach should certainly help.
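
A minimal sketch of the subfolder variant, driving the `gf` command-line compiler from a script. The folder names follow the mini/Lex.gf and full/Lex.gf example; the top-level file name `TopEng.gf`, the mode names, and both helper functions are hypothetical, and the sketch assumes the remaining modules sit where GF finds them by default.

```python
import subprocess
from typing import List

# Mode name -> folder holding that mode's Lex.gf (folder names as in the
# mini/full example; the mode names themselves are an assumption).
LEXICON_DIRS = {"edit": "mini", "full": "full"}


def gf_command(mode: str, top_module: str) -> List[str]:
    """Build the gf invocation that compiles `top_module` to a PGF.

    Only the search path changes between modes, so the same top-level
    grammar picks up either the minimal or the full Lex module.
    """
    return ["gf", "-make", f"--path={LEXICON_DIRS[mode]}", top_module]


def compile_grammar(mode: str, top_module: str) -> None:
    """Run the compiler; raises CalledProcessError if gf reports failure."""
    subprocess.run(gf_command(mode, top_module), check=True)


if __name__ == "__main__":
    # Printing rather than running, so the sketch works without gf installed.
    print(" ".join(gf_command("edit", "TopEng.gf")))
    print(" ".join(gf_command("full", "TopEng.gf")))
```

Switching the path rather than commenting out an import keeps the grammar sources identical in both modes, so nothing needs to be rewritten between an edit-time compile and the offline one.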

John

Daniel

Jun 9, 2021, 2:16:50 AM
to Grammatical Framework
Hi John,

Sorry for the late reply, I was seeing if perhaps Krasimir or someone else had anything to add.

Thanks for talking me through the method, this is how I'm proceeding then!

I'll still be hoping the database implementation comes to fruition though, as it seems very promising both for my use case and for furthering GF as a whole.

Cheers,
Daniel