I like this idea of puzzle pieces and about how to deduce the puzzle pieces from an input stream. That's pretty interesting. I'm not sure if you've come across "wave function collapse" for procedural generation but it has very explicit puzzle piece rules https://marian42.de/article/wfc/
Maybe there are some interesting experiments with that? To use some puzzle pieces to generate some images and then try to get a
learning algorithm to reconstruct the initial rules. I think you
mentioned encode-decode systems before which I really like as an
approach, it's a good way of generating training data.
Wow! Never heard of it, but yes, wow, that is an awesome idea! The above algo is actually an important innovation! It's not particularly deep or tricky (it's a variant of a "greedy" algorithm), but it does overcome a serious difficulty/road-block in generation.
Let me reply to this email in two parts, one about language -- I will cc Amir, who works on Link Grammar, for the first part. (I should probably cc the mailing list...) and reply privately for the rest.
So - I have two different blobs of code for graph generation. One is in Link Grammar, written by Amir, and one is for "generic" jigsaw pieces, in https://github.com/opencog/generate
Amir implemented the LG algo using a wild-card tweak of the existing LG parsing algo. I use an odometer algorithm, which explores all possibilities. Of course, it suffers from the brutal effects of combinatorial explosion. I don't know how Amir does his thing, but it also struggles with combinatorial explosion.
Let me explain the wave-function collapse algo, modified for natural language generation. So: for natural language, we have a one-dimensional grid -- a sentence, with slots for each word. Part of the generation task is to say "I want a sentence with words A,B,C in it, and I want that sentence to be no more than 20 words long"
Given a partially generated string, it selects a word location to fill in: the magic is in selecting this slot. It computes the entropy for each slot, then picks the one with the lowest entropy. Explicitly, this is
S = - sum_i p_i log p_i
where p_i == the probability of word-disjunct pair i, and -log p_i == its cost. In link-grammar, a "word-disjunct pair" is a "jigsaw puzzle piece". The "cost" is something that LG uses to assign priority by mimicking entropy. The sum_i is a sum over all word-disjunct pairs that meet the current boundary condition, i.e. that can connect to the neighboring connectors.
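As a toy sketch of the above, here's how one might compute a slot's entropy from LG-style costs. The exp(-cost) normalization is my assumption about how cost maps back to probability; real LG costs are only entropy-mimicking, so take it as illustrative:

```python
import math

def slot_entropy(costs):
    """Shannon entropy of one slot, given the costs of the candidate
    word-disjunct pairs that can still connect there.  Treat each
    cost as -log p (up to normalization): p_i ~ exp(-cost_i)."""
    weights = [math.exp(-c) for c in costs]
    total = sum(weights)
    probs = [w / total for w in weights]
    return -sum(p * math.log(p) for p in probs)   # S = -sum_i p_i log p_i

# A slot with one cheap piece and a few expensive ones is nearly
# decided (entropy near zero); equal-cost pieces give maximal entropy.
print(slot_entropy([0.0, 5.0, 5.0]))   # small
print(slot_entropy([1.0, 1.0, 1.0]))   # log 3, about 1.0986
```

The greedy step below just picks the slot where this number is smallest.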
Given all possible word locations in the sentence, it selects one location, with greedy algo: it picks the one with the smallest entropy. Having selected a slot to fill in, the "wave function collapse algo" then selects something (some word) for that slot, and then repeats.
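In code, the collapse loop might look like the toy sketch below. The `candidates` callback stands in for the real connector-matching (which is much more involved, and is where all the non-local LG difficulty lives); the names are invented for the sketch:

```python
import math
import random

def entropy(options):
    # options: list of (word, cost); cost acts as -log p up to normalization
    ws = [math.exp(-c) for _, c in options]
    z = sum(ws)
    return -sum((w / z) * math.log(w / z) for w in ws)

def collapse(slots, candidates):
    """One run of the wave-function-collapse loop on a toy grid.
    `slots` is a list of None (undecided) or a chosen word;
    `candidates(i, slots)` returns the (word, cost) pairs that still
    fit slot i, given its already-decided neighbours."""
    while any(s is None for s in slots):
        # greedy step: among undecided slots, pick the lowest-entropy one
        open_slots = [i for i, s in enumerate(slots) if s is None]
        i = min(open_slots, key=lambda j: entropy(candidates(j, slots)))
        words, costs = zip(*candidates(i, slots))
        ws = [math.exp(-c) for c in costs]
        slots[i] = random.choices(words, weights=ws)[0]  # cheap pieces favored
    return slots

# Invented toy lexicon, just to exercise the loop:
def toy_candidates(i, slots):
    table = {0: [("LEFT-WALL", 0.0)],
             1: [("this", 0.2), ("that", 1.0)],
             2: [("is", 0.1), ("was", 0.9)]}
    return table[i]

print(collapse([None, None, None], toy_candidates))
```

Back-tracking (to get the second-cheapest sentence, and so on) is deliberately left out of the sketch.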
I like this algo because it explicitly avoids a combinatoric explosion, and it also explicitly selects the lowest-cost sentences from the get-go. It can generate more than one sentence, by back-tracking, but it will do so by generating sentences in cost-increasing order, more or less. This makes it very nice for practical applications.
This sounds like a great way to generate sentences, until you start thinking about how to write the code, at which point, things get a little dicey. In LG, the constraints are non-local, unlike blocks-world, where all constraints are nearest-neighbor constraints. Also, unlike blocks-world, selecting a word can result in the creation of new spaces ("dark energy" for physics aficionados) -- for example, given the partial sentence
     |       |     |
LEFT-WALL   this   is

it can be extended as

     |       |   +--Em--+
     |       |   |      |
LEFT-WALL   this *      is
where the star is a new empty slot, which can be filled in with some emphasizing adverb: e.g. "this really is ..." or "this surely is ..." -- so unlike blocks-world, the size of the LG universe can grow.
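This "growing universe" is easy to show in miniature. The sketch below is invented for illustration (real LG disjuncts and connectors are far richer): placing a word whose disjunct opens an Em- connector inserts a fresh empty slot to its left, where an emphasizing adverb may later land.

```python
def place(slots, i, word, opens_em=False):
    """Put `word` into slot i; if its disjunct opens an Em- connector,
    insert a brand-new empty slot just before it -- the grid grows."""
    slots = list(slots)
    slots[i] = word
    if opens_em:
        slots.insert(i, None)   # the new '*' slot
    return slots

sent = ["LEFT-WALL", "this", None]
sent = place(sent, 2, "is", opens_em=True)
print(sent)   # ['LEFT-WALL', 'this', None, 'is']
```

So unlike blocks-world, the number of slots is not fixed up front, and the entropy bookkeeping has to cope with slots appearing mid-run.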
There are some other subtle points. Suppose I want to generate sentences with specific words in them, but I don't know the word order, yet. The words might be "Ben" "eats" and "pizza" but of course, one could generate: "the pizza was eaten by Ben" or "we saw Ben eating a pizza" or "he was eating a pizza, when we last saw Ben". Worse: we need lexical functions to provide the alternatives eats-eating-eaten. So there's lots of interesting stuff to explore.
Nonetheless, the core idea, of assembling the jigsaw-puzzle pieces in such a way that one is working with the minimum-entropy locations first, is a fantastic idea. (Painfully obvious, in retrospect: when one assembles a real-life jigsaw puzzle, one does exactly this: one starts with the edges, because there aren't very many of them. Then one organizes by color, and does the ones with the smallest number of choices, first. And so on.)
I like it!
p.s. Amir, we should talk, probably off-list, about whether we can steal any aspect of the above, and slot it into the existing LG code base. For example, during counting, to maybe limit choices, or something. There are ... subtleties ...