Grammar induction in Prolog

Stassa Patsantzis

Jun 8, 2015, 7:26:38 AM
to swi-p...@googlegroups.com
Hi list,

I'm looking for any resources on grammar induction (a.k.a. grammatical inference) in Prolog. I'm interested both in deriving grammar rules and in estimating their probabilities (though I'm not only interested in stochastic grammars). Algorithm implementations, papers, books, websites: anything anyone might have in mind could be useful.

Any related work, such as chunkers or POS taggers, might be helpful as well.

I've been looking around and I've found quite a few resources on NLP in general, but not much on grammar induction specifically. For reference, I've searched the mailing list archives on Gmane and looked at the most common textbooks (Pereira & Shieber, Covington, Gazdar & Mellish, etc.), and I've also had a look at PRISM and ProbLog, so I'm looking for something beyond those sources.

Kind regards,
Stassa

Stassa Patsantzis

Jun 8, 2015, 7:34:03 AM
to swi-p...@googlegroups.com
Oof. Apologies for the duplicate posts. Each time I got an error message saying the new topic had not been posted. Should have checked before hitting post again.

Michael Ben Yosef

Jun 9, 2015, 5:02:26 AM
to swi-p...@googlegroups.com
Hi Stassa,

I'm definitely not the best person to answer this, but I've just finished the following recent and excellent book, which you haven't mentioned:

Title: Language Processing with Perl and Prolog: Theories, Implementation, and Application
Series: Cognitive Technologies
Author: Nugues, Pierre M.
Edition: 2nd ed., 2014, XXV, 662 p., 200 illus. (18 in colour), hardcover
ISBN-13: 978-3-642-41464-0

It has great introductions to chunkers and POS taggers, amongst many other topics, and it also gives the following reference:

Zelle, J. M., & Mooney, R. J. (1997). An inductive logic programming method for corpus-based parser construction. Technical note, University of Texas at Austin.

I see a lot of mentions of grammar induction (including probabilistic grammar induction) in Luc De Raedt's Logical and Relational Learning, so I imagine there is active research on this in ILP and now SRL. Perhaps that's where to look, but I'm not knowledgeable enough to suggest references.

Hope that helps a little!

Michael

Stephen Coda

Jun 9, 2015, 9:00:56 AM
to swi-p...@googlegroups.com
Parsing... Book added to Amazon wish list...

Stassa Patsantzis

Jun 9, 2015, 5:42:39 PM
to swi-p...@googlegroups.com
Hi Michael,

Thank you for replying! Yes that's very helpful, cheers!

I'll have a look at De Raedt's book; I'm interested in ILP and SRL as well.

Generally I'm looking for example implementations of GI (and more generally machine learning) algorithms in Prolog.
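
To give an idea of the level I'm starting from, the kind of toy I have in mind reads productions straight off bracketed parse trees (which of course dodges the hard part, since real induction has to work without the trees):

% Toy rule extraction: read CFG productions straight off bracketed parse
% trees such as s(np(n(cat)), vp(v(sleeps))). Illustrative only.
tree_rules(Leaf, []) :-
    atomic(Leaf), !.                          % terminal leaf: no rule
tree_rules(Tree, [rule(Cat, Body) | Rules]) :-
    Tree =.. [Cat | Children],
    maplist(node_label, Children, Body),      % right-hand side of the rule
    maplist(tree_rules, Children, Nested),    % rules from the subtrees
    append(Nested, Rules).

node_label(Node, Node)  :- atomic(Node), !.
node_label(Node, Label) :- functor(Node, Label, _).

% ?- tree_rules(s(np(n(cat)), vp(v(sleeps))), Rs).
% Rs = [rule(s, [np, vp]), rule(np, [n]), rule(n, [cat]),
%       rule(vp, [v]), rule(v, [sleeps])].

What I'm after is examples of how people get from raw strings, or from a partial grammar plus strings, to rules like these.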

You're right about Nugues' book; I forgot to mention it. It does have some great examples.

Although, to be fair, I think it will go down in history for this one sentence, the last of the third paragraph of the Preface:

"I chose to write examples in two languages to make the algorithms easy to understand and encode: Perl and Prolog"

(Emphasis mine :)

Kind regards,
Stassa

Eyal Dechter

Jun 9, 2015, 10:25:40 PM
to Michael Ben Yosef, <swi-prolog@googlegroups.com>
Much of my research is on probabilistic grammar induction, and the truth is there isn't much out there, especially in Prolog. The vast majority of the research on grammar learning is on estimating the parameters of grammars whose rules are fixed and taken from one annotated corpus or another. Moreover, since the computational linguistics community has focused a lot on small increases in parsing accuracy and speed, people tend to roll their own optimized code, instead of using something more general, like Prolog. 

The paper I've attached uses PRISM to induce probabilistic context-sensitive grammars, but there are serious limits to that approach because PRISM does not allow any kind of approximate inference. More recently, I've been working on an alternative approach that I've been implementing in SWI-Prolog, but I do not yet have a working prototype. I'd love to hear what approach you are considering.
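
To make that concrete (this is not the PRISM encoding from the attached paper, just a throwaway sketch in plain Prolog of the kind of object involved), a PCFG can be written down as annotated rules and a derivation scored by multiplying the probabilities of the rules it uses:

% rule(Head, Body, Prob): a toy PCFG; probabilities for each Head sum to 1.
rule(s,  [np, vp], 1.0).
rule(np, [d, n],   0.6).
rule(np, [n],      0.4).
rule(vp, [v, np],  0.7).
rule(vp, [v],      0.3).
rule(d,  [the],    1.0).
rule(n,  [cat],    0.5).
rule(n,  [dog],    0.5).
rule(v,  [saw],    1.0).

% parse(Cat, P): recognise a phrase of category Cat, accumulating the
% product of the probabilities of the rules used in the derivation.
parse(Cat, P) -->
    { rule(Cat, Body, P0) },
    parse_body(Body, P1),
    { P is P0 * P1 }.

parse_body([], 1.0) --> [].
parse_body([X|Xs], P) -->
    (   { rule(X, _, _) }             % non-terminal: expand it
    ->  parse(X, P0)
    ;   [X], { P0 = 1.0 }             % terminal: consume the token
    ),
    parse_body(Xs, P1),
    { P is P0 * P1 }.

% ?- phrase(parse(s, P), [the, cat, saw, the, dog]).
% P = 0.063 (up to floating-point rounding).

Parameter estimation is then the problem of setting those numbers from strings alone, and rule induction the much harder problem of inventing the rule/3 facts themselves.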

Best,
Eyal Dechter
  

Attachments: nips_workshop.pdf, cogsci.pdf

Stassa Patsantzis

Jun 11, 2015, 6:18:08 AM
to swi-p...@googlegroups.com, mby...@gmail.com
Hi Eyal, thanks for replying!

I'm rolling my own code too; I'm just doing it in Prolog, so I wanted to have a
look at some examples to save myself some time and trouble.

I had a (quick) look at the two papers you attached. The one about learning
numbers reminded me a bit of something I've been thinking about for my project.
I want to learn a Controlled Natural Language for a collectible card game
(Magic: the Gathering). Eventually I'd like to develop a tournament-level AI
player. The game's CNL is used to give instructions to players, but it's also
used in the game's manual, and I think you can't really learn one without the
other, the language without the game and vice versa. That's certainly true for
human players.

For the time being I'm concentrating on deriving the CNL's grammar. My plan so
far is to hand-craft a partial grammar and derive a more complete one from the
language's corpus (the game's cards). I'm optimistic about this because my
target language has a fixed context and the length of strings is limited by the
size of the print area on playing cards.
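
To give an idea of what I mean by a hand-crafted partial grammar, here's the
sort of (entirely made up, far too small) DCG fragment I'd start from for card
instructions:

% A tiny hand-written fragment of the card-text CNL; entirely illustrative,
% the real card text is far richer than this.
instruction --> verb_phrase.

verb_phrase --> [destroy], object.
verb_phrase --> [exile], object.
verb_phrase --> [draw], countable.

object --> [target], permanent_type.

permanent_type --> [creature].
permanent_type --> [artifact].

countable --> [a, card].
countable --> [two, cards].

% ?- phrase(instruction, [destroy, target, creature]).
% true.

The induction step would then have to add rules for things like "destroy all
creatures" or "draw three cards" by generalising from the corpus, rather than
by me typing them in.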

The next step would be to try to learn a complete specification of the language
from the game's manual. Like I say, the manual is also pretty much written in
the target CNL, so I was hoping to bootstrap the process with what I can learn
from the corpus.

I've also thought about defining a rule engine (some kind of automaton) that I
could then use as an oracle for the language learner, particularly to produce
counter-examples. Then again, I'd prefer to learn this rule engine from the
manual too; you can see why I say you can't really learn the game without
learning the language.
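
Concretely, by an oracle I mean something like the toy below, sitting on top
of a reference grammar (here just the hand-written fragment above, standing in
for the real rules engine); hypothesis//0 is a placeholder for whatever grammar
the learner currently believes in:

% The oracle answers membership queries against a reference grammar (here
% the hand-written fragment above, standing in for the real rules engine).
oracle_accepts(Sentence) :-
    phrase(instruction, Sentence).

% A counterexample is a string on which the learner's current hypothesis
% (the placeholder hypothesis//0) and the oracle disagree. Enumerating by
% increasing length keeps the search bounded for each candidate length.
counterexample(Sentence) :-
    length(Sentence, _),
    (   phrase(hypothesis, Sentence),
        \+ oracle_accepts(Sentence)
    ;   oracle_accepts(Sentence),
        \+ phrase(hypothesis, Sentence)
    ).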

I should say I haven't written much code yet; I've been reviewing the
bibliography and trying to understand what the state of the art is. It's a bit
confusing, because people keep reporting Gold's result and Angluin's, and in
the same breath they report some success in learning some language, even up to
N-CFG level. So my feeling is that basically there's a small ecosystem of
heuristics described in a billion papers that I'll never have time to read.

I'm interested in the alternative approach you mention. My email is ep294 at
sussex dot ac dot uk, please get in touch if you like. 

Kind regards,
Stassa

Andy Green

Dec 3, 2016, 5:28:16 PM
to swi-p...@googlegroups.com, mby...@gmail.com
Constraint handling rule grammars can be used to implement grammar induction systems in Prolog.
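
For a flavour of the idea (this is plain CHR in SWI-Prolog used for bottom-up recognition, not Christiansen's CHR Grammars system itself, and not induction yet), word categories can be posted as constraints and combined by propagation rules:

:- use_module(library(chr)).

:- chr_constraint token/3, cat/3.

% token(Word, From, To) and cat(Category, From, To) span word positions.

% Lexical rules: each word propagates its category over the same span.
token(the, F, T) ==> cat(d, F, T).
token(cat, F, T) ==> cat(n, F, T).
token(dog, F, T) ==> cat(n, F, T).
token(saw, F, T) ==> cat(v, F, T).

% Bottom-up combination: adjacent constituents propagate larger ones.
cat(d, F, M),  cat(n, M, T)  ==> cat(np, F, T).
cat(v, F, M),  cat(np, M, T) ==> cat(vp, F, T).
cat(np, F, M), cat(vp, M, T) ==> cat(s, F, T).

% Post a sentence as position-indexed tokens.
parse(Words) :-
    foldl([W, I0, I]>>(I is I0 + 1, token(W, I0, I)), Words, 0, _).

% ?- parse([the, cat, saw, the, dog]), find_chr_constraint(cat(s, 0, 5)).

Induction would then amount to proposing and scoring new combination rules rather than writing them by hand.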

Sam Neaves

Dec 4, 2016, 5:36:30 AM
to Andy Green, SWI-Prolog, mby...@gmail.com

On Sat, Dec 3, 2016 at 10:28 PM, Andy Green <jar...@gmail.com> wrote:
Constraint handling rule grammars can be used to implement grammar induction systems in Prolog. These articles may be relevant.