HELP: Integrating PolyGen (random-text gen) into Evennia ... possible?

Tristano Ajmone

Feb 18, 2015, 7:33:42 AM

to eve...@googlegroups.com

For a number of years I've been using a small app called PolyGen (available for Win, Linux & Mac):

PolyGen is a program for generating random sentences according to a grammar definition, that is following custom syntactical and lexical rules. It takes a text file as source program defining a grammar by means of BNF-like rules and executes it, eventually showing the result. Here a source program is a grammar definition, the execution consists in the exploration of such grammar by selecting a random path and the result is the sentence built on the way.
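To make the description above concrete, here is a minimal Python sketch of random generation from a BNF-like grammar. It is not PolyGen's syntax or its implementation: the GRAMMAR dictionary, the expand and generate functions and the sample words are all hypothetical, chosen only to show what "selecting a random path" through a grammar means.

```python
import random

# Hypothetical toy grammar: each non-terminal maps to a list of alternatives,
# and each alternative is a sequence of terminals and/or non-terminals.
GRAMMAR = {
    "S": [["The", "ADJ", "NOUN", "VERB", "."]],
    "ADJ": [["old"], ["dusty"], ["silent"]],
    "NOUN": [["tavern"], ["tower"], ["crypt"]],
    "VERB": [["creaks"], ["waits"], ["crumbles"]],
}


def expand(symbol, grammar, rng):
    """Expand a symbol by following one randomly chosen path through the grammar."""
    if symbol not in grammar:
        return [symbol]  # a terminal: emit it as-is
    production = rng.choice(grammar[symbol])
    words = []
    for part in production:
        words.extend(expand(part, grammar, rng))
    return words


def generate(grammar, start="S", seed=None):
    """Build one sentence; the same seed always gives the same sentence."""
    rng = random.Random(seed)
    words = expand(start, grammar, rng)
    return " ".join(words).replace(" .", ".")
```

Calling generate(GRAMMAR) repeatedly yields a different sentence each time, while passing a fixed seed makes the result repeatable.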

PolyGen helps create randomly patchworked texts (of any length) to auto-generate descriptions which vary over time. Random productions are based on rules, so the output text stays consistent in gender and number when the various random outputs are stitched together. It's very handy and has been used in MUDs to autogenerate descriptions and characters, but also code snippets, etc. After my question I'll post more info and links about PolyGen (for those interested).

QUESTION...

My question is: how can I integrate PolyGen's output with Evennia? I haven't come across anything in the Evennia docs so far that mentions restrictions or proper ways to do it. Can I just put in my Evennia Python script a call to an external script which will stream out the produced text when invoked? Are there catches and caveats to implementing this? PolyGen is quite fast in outputting the text, so could I just have Evennia on hold with a STRING = myscript.py approach (meaning the variable takes as its string value the output of my script), or is it better to set up some asynchronous method?

I see great potential for PolyGen in Evennia.

As an example, at this link there is a source file of an Italian MUD (Maciste) which relied on PolyGen in its hardcode:

ftp://fantasiadomain.com/lod/sorgenti_maciste/old/src/polygen.c

Thanks for any help you can give me to integrate it into Evennia.

Tristano (Italy)

MORE INFO ON POLYGEN

The program was created by an Italian professor. It's written in OCaml. This is the download page (Win, Linux & Mac) of the official website:

http://www.polygen.org/it/download

The site is no longer maintained (PolyGen is stable and final), but there is a Facebook group for PolyGen:

https://www.facebook.com/groups/35567209510/

There are a few example grammars in English, but because the app is Italian it hasn't spread much in the English-speaking world.

The English manual can be found inside this zipped archive:

http://archive.ubuntu.com/ubuntu/pool/universe/p/polygen/polygen_1.0.6.ds2.orig.tar.gz

or individually, as a single HTML source:

http://apt-browse.org/browse/ubuntu/trusty/universe/all/polygen/1.0.6.ds2-13build1/file//usr/share/doc/polygen/HOWTO-Refman.en.html/

Some links of interest related to PolyGen:

A FEW NOTES ON POLYGEN

PolyGen has its own (OCaml-like) syntax for writing text grammars. There is no "#include" function in it, so to stitch together multiple grammar templates I have to use some pipelining scripts. As an example, I've built a PolyGen grammar that outputs fantasy character names, another one for names of places, etc. When I create a grammar to output a specific description for a context and I need the names of characters and places, I create a script which pipes all three of them together and feeds them to PolyGen, so that all definitions are reachable by the program. An "#include" function would have helped, but there isn't one...
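The stitching step described above can be sketched in Python as a plain concatenation of grammar files into one combined file, which is then handed to PolyGen. This is only an illustration of the idea, not the actual scripts used: the combine_grammars function and any file names are hypothetical, and the sketch assumes the grammars do not define clashing symbol names.

```python
from pathlib import Path


def combine_grammars(grammar_paths, combined_path):
    """Concatenate several grammar files into one, in the order given.

    Returns the combined text and also writes it to `combined_path`, so the
    result can be passed to the generator as a single grammar file.
    """
    chunks = []
    for path in grammar_paths:
        text = Path(path).read_text(encoding="utf-8")
        chunks.append(text.rstrip("\n"))
    combined = "\n\n".join(chunks) + "\n"
    Path(combined_path).write_text(combined, encoding="utf-8")
    return combined
```

For instance, combine_grammars(["names.grm", "places.grm", "scene.grm"], "all.grm") would produce one file holding all three sets of definitions (the file names here are made up).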

So far I've used it to randomize email-template signatures and quotations, and even on a local Apache server to randomize the code of HTML pages.

Griatch Art

Feb 18, 2015, 7:57:23 AM

to eve...@googlegroups.com

You can integrate your program using Python's subprocess module; something like this:

import subprocess
string = subprocess.check_output(["polygen", "args", "for", "polygen"])

See subprocess.check_output in the Python standard library for details.
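As a slightly fuller sketch of the same blocking call, the helper below adds a timeout and decodes the output to a string. The name run_text_generator is hypothetical, and the demo runs the Python interpreter as a stand-in so that it works without PolyGen installed; the exact PolyGen command line is an assumption to check against its manual.

```python
import subprocess
import sys


def run_text_generator(command, timeout=5.0):
    """Run an external command and return its standard output as a string.

    Raises subprocess.CalledProcessError on a non-zero exit status and
    subprocess.TimeoutExpired if the command runs longer than `timeout` seconds.
    """
    completed = subprocess.run(
        command,
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode bytes to str
        timeout=timeout,
        check=True,
    )
    return completed.stdout.strip()


# Stand-in command so the sketch runs anywhere. With PolyGen it would be
# something like ["polygen", "mygrammar.grm"] (hypothetical file name).
demo_command = [sys.executable, "-c", "print('a dusty tavern')"]
text = run_text_generator(demo_command)
```

The timeout only bounds how long the call can hang; the call itself still blocks while it runs.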

Please note though that this is a blocking operation. Evennia (yes, all of Evennia, all users everywhere) will wait for this call to return. This will most likely be noticeable. So if you want to do it right, it's better to issue this call asynchronously using a threaded deferred:

import subprocess
from twisted.internet.threads import deferToThread

def _callback(string):
    "called when polygen returns, whenever it does"
    # deal with string here

# use a thread to launch and monitor the subprocess
deferred = deferToThread(subprocess.check_output, ["polygen", "args", "for", "polygen"])
# assign the callback for when it returns
deferred.addCallback(_callback)

Explaining asynchronous programming is a little outside the scope of this post, but in principle what happens here is that we call polygen using a thread that sits and watches the I/O until it returns (however long that will take). This means that until polygen returns with a result, Evennia is free to go about its business. Once polygen returns, the _callback() function you specify is called with the result. The drawback here is that you cannot make use of the return value right away in your code: asynchronous operation requires some different thinking than normal, sequential programming.
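The ordering described above (the caller carries on, and the result only shows up later in the callback) can be seen in a small stand-alone sketch. Note that this swaps in the standard library's concurrent.futures instead of Twisted, purely so it runs on its own; slow_generator is a hypothetical stand-in for the PolyGen call. Inside Evennia the deferToThread version remains the one to use, since its callbacks fire in the reactor thread, whereas the callback here runs in the worker thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

events = []                   # records the order in which things happen
release = threading.Event()   # lets the demo decide when the "slow" call ends
done = threading.Event()      # set once the callback has run


def slow_generator():
    """Stand-in for a slow external call such as running PolyGen."""
    release.wait(timeout=5)
    return "a dusty tavern"


def on_result(future):
    """Called once the worker has finished; only here is the text available."""
    events.append("callback got: " + future.result())
    done.set()


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(slow_generator)
    future.add_done_callback(on_result)
    # The result is not available yet, but the caller is free to carry on.
    events.append("caller continues")
    release.set()             # now let the slow call finish
    done.wait(timeout=5)
```

After this runs, events holds "caller continues" first and the callback entry second, which is exactly the "cannot use the return right away" situation.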
Griatch

Tristano Ajmone

Feb 18, 2015, 8:28:27 AM

to eve...@googlegroups.com

Thanks a lot! This is what I needed.

I expected the blocking issue. I'll look deeper into the asynchronous approach; if I'm not wrong, I've seen some code of yours in the examples that deals with thread pooling. I might find some inspiration there.

What I had in mind for PolyGen pertains to character generation, procedural generation of rooms, random descriptions, direction-giving talking NPCs, and other things which might occur just once (as in procedural generation) or that can wait a few seconds for the output, since the output is the final part of the interaction (so even if it comes after the script has quit, it's OK).

For other uses, I could consider memoizing the output. PolyGen has an optional parameter for the random seed: same seed, same output. I could limit the range of seeds (otherwise we're dealing with a huge range of numbers, which could lead to the memoized output eating up the DB or the filesystem) and memoize output on the filesystem. It might not be worth it, though: PolyGen is very fast, and OCaml is a great language for this type of work. So, unless there are going to be hundreds of concurrent PolyGen invocations, it shouldn't clog the system.
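A minimal sketch of that memoization idea, assuming a generator that is deterministic for a given seed. The cached_text function, its parameters and the one-file-per-seed layout are hypothetical; `generate` would be whatever callable wraps the PolyGen invocation with its seed option.

```python
from pathlib import Path


def cached_text(seed, generate, cache_dir, seed_range=1000):
    """Return the text for `seed`, generating it only on the first request.

    `generate` is any callable taking an integer seed and returning a string.
    Seeds are folded into `seed_range` buckets, so the cache can never grow
    past that many files.
    """
    bucket = seed % seed_range
    cache_file = Path(cache_dir) / "{}.txt".format(bucket)
    if cache_file.exists():
        return cache_file.read_text(encoding="utf-8")
    text = generate(bucket)
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(text, encoding="utf-8")
    return text
```

Folding the seed with a modulo is one simple way to cap the cache size; the trade-off is that only seed_range distinct texts can ever be produced.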

I'll use it sparingly anyway. I see it as a great aid in procedural generation: it could be used to pattern-randomize world-building, from maps to descriptions to populating rooms with objects. But this would be a one-off sequence of invocations, its output being permanent. In this context, it could even be used easily outside of Evennia, by creating a grammar that produces files to feed to the Batch Command/Code Processor. As a matter of fact, it seems that spammers have been using PolyGen to create HTML pages and emails that vary in code as well as content. Having tested it with HTML code generation, I know that it's a flexible tool for procedural generation. Its grammar is not too user-friendly though: it took me a good day to get confident with it, even though it's well documented.
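As a rough sketch of the "generate files for the batch processor" idea, the function below turns generated room names and descriptions into batch-command text. It rests on assumptions to verify against your Evennia version: that in batch-command files a line starting with "#" is a comment which also ends the preceding command, and that @dig/teleport and @desc behave as in the default command set. The build_batch_commands name and the sample room are hypothetical.

```python
def build_batch_commands(rooms):
    """Render (name, description) pairs as text for a batch-command file.

    Assumes that a line starting with '#' is a comment and also marks the
    end of the command before it.
    """
    lines = ["# Generated room definitions"]
    for name, description in rooms:
        # Assumed default command: dig the room and move into it.
        lines.append("@dig/teleport {}".format(name))
        lines.append("# describe the room we just moved into")
        lines.append("@desc {}".format(description))
        lines.append("#")
    return "\n".join(lines) + "\n"
```

The descriptions passed in would come from the text generator; writing the returned string to a file gives something to feed to the batch processor in one go.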

eartsar

Feb 19, 2015, 12:15:23 AM

to eve...@googlegroups.com

Hi, Tristano.

I admittedly haven't looked at PolyGen too closely, but NLG through a CFG is actually something I brought up on the IRC the other day. What benefit does PolyGen bring to the table that NLTK's CFG api does not?

http://www.nltk.org/book/ch08.html

The syntax is similar-ish, and being that nltk is pure python, integration is trivial. If you did have a grammar for PolyGen to begin with, would porting a grammar over make more sense?

Tristano Ajmone

Feb 19, 2015, 6:29:36 PM

to eve...@googlegroups.com

Thanks, this is a great link that I didn't know about. I had a look at the examples and so far it does seem to employ a very similar grammar system. I didn't manage to verify some details (like adding + or - to raise or lower the chances of one entry being chosen over others, etc.).

In fact, simply by being in Python, NLTK is a better choice not only for Evennia but also in general; as I said, PolyGen doesn't easily allow mixing and matching across grammars except through script piping. That said, I will surely look into NLTK more closely.

Advantages of PolyGen could be: it is a standalone tool with a GUI, so for Windows users it's easier to approach, at least for those who would not install Python for one reason or another. But it really depends on what one has in mind. I was also thinking of a tool external to Evennia which would produce a batch script for populating a large map, to be fed into Evennia; alongside the batch script it could also produce a graphic map for navigation. In this case I might consider using PolyGen.

As for integration into Evennia, if NLTK does all that PolyGen does, I would go for NLTK. PolyGen is aimed solely at random productions. I am not yet sure whether that makes it easier to use or not. After all, "easy" is a big word in this context: a full understanding of these grammars usually befits only people with some grounding in coding.

From what I've seen so far, it seems clear that converting grammars between these two programs shouldn't be an issue, so any grammar created for one of them will likely be usable by the other with slight changes. But I think that the presence of Python allows some types of productions that PolyGen can't handle: PolyGen doesn't keep a "memory record" of previous choices. So, for example, once the dice are cast and a choice is made between feminine and masculine gender, all productions related to this gender have to be cascaded at once (you can mix them in the final order of output, but it makes things a bit entangled in long productions). It is possible, though, to use fragmented calls to PolyGen via external scripts to solve this problem, but in Python this would not require any "outsourcing".
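The "memory record" point can be illustrated with a short Python sketch: the gender is chosen once, stored in a variable, and reused by fragments that are produced separately. This is plain Python rather than NLTK or PolyGen, and the word lists and the describe_character function are hypothetical.

```python
import random

# Hypothetical word lists keyed by the remembered choice.
PRONOUNS = {
    "feminine": {"subject": "she", "possessive": "her"},
    "masculine": {"subject": "he", "possessive": "his"},
}
NAMES = {
    "feminine": ["Mirella", "Ilda"],
    "masculine": ["Aldo", "Bruno"],
}


def describe_character(seed=None):
    """Pick a gender once, then build separate fragments that all agree with it."""
    rng = random.Random(seed)
    gender = rng.choice(sorted(PRONOUNS))  # the one-time choice that is remembered
    name = rng.choice(NAMES[gender])
    pronoun = PRONOUNS[gender]
    intro = "{} enters the room.".format(name)
    later = "{} shakes the dust from {} cloak.".format(
        pronoun["subject"].capitalize(), pronoun["possessive"]
    )
    return gender, intro, later
```

Because the choice lives in an ordinary variable, the two fragments can be emitted at different times, or with other text in between, and still agree.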

I will dig into it and see if a grammar converter could easily be made available. This could be the best solution because the PolyGen GUI is much more convenient (on Windows!) when it comes to creating and editing grammars. Final grammars could then be exported and polished in NLTK.