Pattern Miner improvements

Ben Goertzel

Feb 10, 2017, 6:34:49 AM

to opencog, Nil Geisweiller, Hedra Seid, Shujing Ke

Nil,

We have about 8 billion different uses for the Pattern Miner on the
immediate horizon....

I know you reviewed the code last year and had some thoughts about
what needs to be done to make it easier to use for various
applications....

Hedra has looked at the code too and run it on some example NLP-parse
outputs; she will be available to spend some time on this and is a
good C++ programmer.... For instance she could spend time starting
March (and maybe sometime before that) on making Scheme bindings for
the pattern miner...

Shujing might also be available to help on a part-time basis soon,
though I'm not clear on that...

Nil, it would be good if you could take a few days and

1) make a list of what you think should be done to improve pattern
miner usability

2) maybe make a first sketch of what you think a Scheme API for the
pattern miner should look like

This would provide valuable guidance for Hedra and/or Shujing to work on this...

thanks
Ben

--
Ben Goertzel, PhD
http://goertzel.org

“I tell my students, when you go to these meetings, see what direction
everyone is headed, so you can go in the opposite direction. Don’t
polish the brass on the bandwagon.” – V. S. Ramachandran

Shujing Ke

Feb 11, 2017, 11:50:52 AM

to Ben Goertzel, opencog, Nil Geisweiller, Hedra Seid

Hi,

I am still a bit busy with my baby - my parents just went back to China.

If Nil gives some examples of what the Scheme API should look like, I will have a look and evaluate how much work it would be and whether I have time to do it.

Thanks,

Shujing

Nil Geisweiller

Feb 11, 2017, 1:00:57 PM

to Shujing Ke, Ben Goertzel, opencog, Nil Geisweiller, Hedra Seid

Hi,

I'm not familiar enough with the pattern matcher yet to suggest a Scheme
API. I do believe I know however the next steps to clean it up. So I'll
look into that first, create a github issue, then look into the API design.

For the API, it would be good to have a group chat involving
the people who will use it in the future. Some concrete usage
examples would help me work out what the API should look like.

Personally, I would like to see an API that facilitates interaction
with the URE. I don't really have concrete examples, just an idea.

Nil

Ben Goertzel

Feb 11, 2017, 3:18:48 PM

to Nil Geisweiller, Shujing Ke, opencog, Hedra Seid

Hi Nil,

> I'm not familiar enough with the pattern matcher yet to suggest a Scheme
> API. I do believe I know however the next steps to clean it up. So I'll look
> into that first, create a github issue, then look into the API design.

Thanks, that will be helpful!

> For the API, it would be good to have a group chat involving
> the people who will use it in the future. Some concrete usage
> examples would help me work out what the API should look like.

Maybe we can try, first of all, a wiki page for summarizing the
near-term examples for which we'd like to see pattern mining work...

E.g. on my radar for this year we have (in no particular order)

-- Mining of inference histories, for inference control

-- Mining of dialogue histories, for learning dialogue patterns (or
more generally, verbal/nonverbal interaction patterns)

-- Mining of sets of genomic datasets or medical patient records, to
find surprisingly common combinations of features

-- Mining of surprising combinations of visual features in the output
of a relatively "disentangled" deep NN (such as the
pyramid-of-InfoGANs that Ralf, Selameab, Tesfa, Yenat and I are
working on)

-- Mining of surprising combinations of semantic relationships, in the
R2L output of a large number of simple sentences read into Atomspace

-- Mining of surprising combinations of syntactic relationships, in an
Atomspace containing a set of syntactic relationships corresponding to
each word in the dictionary of a given language (to be done
iteratively within the language learning algorithm Linas is
implementing)

-- Mining of surprising (link-parser link combination,
Lojban-Atomese-output combination) pairs, in a corpus of (link parses,
Lojban-Atomese outputs) obtained from a parallel (English, Lojban)
corpus
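
Several of these use cases hinge on what makes a combination "surprisingly common". One way to make that concrete is to compare how often features co-occur with how often they would co-occur if they were independent. The following is a minimal Python sketch under that assumption; the function name `surprisingness`, the toy records, and the independence baseline are illustrative choices and are not taken from the Pattern Miner code.

```python
from itertools import combinations
from math import prod

def surprisingness(records, combo):
    """Observed frequency of `combo` divided by the frequency expected
    if its features occurred independently (hypothetical measure)."""
    n = len(records)
    observed = sum(1 for r in records if combo <= r) / n
    expected = prod(sum(1 for r in records if f in r) / n for f in combo)
    return observed / expected if expected > 0 else 0.0

# Toy "patient records": each record is a set of features (made-up data).
records = [
    frozenset({"gene_A", "gene_B", "symptom_X"}),
    frozenset({"gene_A", "gene_B"}),
    frozenset({"gene_C", "symptom_X"}),
    frozenset({"gene_C"}),
]

# Score every pair of features and pick the most surprising one.
features = sorted(set().union(*records))
scores = {
    combo: surprisingness(records, frozenset(combo))
    for combo in combinations(features, 2)
}
best = max(scores, key=scores.get)
```

A ratio above 1 means the pair shows up together more often than independence would predict; a real measure would also need to account for sample size.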

> Personally, I would like to see an API that facilitates interaction with
> the URE. I don't really have concrete examples, just an idea.

Hmmm...

Well one could say there is a rule

FindSignificantPatterns

whose input is

-- a template T restricting what kinds of patterns to look for

-- a GroundedSchemaNode containing the significance measure one wants to use

and whose output is, say, a SetLink containing the most significant
patterns found

This is a pretty computationally expensive inference rule though ;)
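
To illustrate the shape of such a rule, here is a rough Python sketch. Everything in it is a stand-in: facts are plain tuples rather than Atoms, the template is an ordinary predicate, the significance measure is a callable playing the role of the GroundedSchemaNode, and the returned Python set plays the role of the SetLink. The names `find_significant_patterns`, `abstractions`, and `VAR` are hypothetical and do not exist in OpenCog.

```python
from collections import Counter

VAR = "$X"  # hypothetical marker for a variable slot in a pattern

def abstractions(fact):
    """Yield the fact itself plus every pattern obtained by replacing
    exactly one argument with a variable."""
    yield fact
    for i in range(1, len(fact)):
        yield fact[:i] + (VAR,) + fact[i + 1:]

def find_significant_patterns(facts, template, significance, top_k=2):
    """Toy stand-in for the proposed rule: count candidate patterns,
    keep those accepted by `template`, rank them by `significance`."""
    counts = Counter(p for f in facts for p in abstractions(f))
    candidates = [p for p in counts if template(p)]
    ranked = sorted(candidates,
                    key=lambda p: significance(p, counts),
                    reverse=True)
    return set(ranked[:top_k])

facts = [
    ("likes", "alice", "tea"),
    ("likes", "bob", "tea"),
    ("likes", "carol", "tea"),
    ("likes", "alice", "coffee"),
    ("owns", "bob", "car"),
]

# Template: only "likes" patterns that contain a variable.
template = lambda p: p[0] == "likes" and VAR in p
# Significance: raw support count (the simplest possible measure).
significance = lambda p, counts: counts[p]

result = find_significant_patterns(facts, template, significance)
```

Passing the measure in as a parameter is what keeps the rule generic: swapping raw counts for a surprisingness score would not change the rule itself.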

But maybe you're thinking of something else...

-- Ben

Ben Goertzel

Feb 11, 2017, 3:21:12 PM

to Shujing Ke, opencog, Nil Geisweiller, Hedra Seid

Thanks a lot Shujing. Indeed babies are time-consuming, though delightful ;)

So Nil says he'll fairly soon write up his suggestions regarding PM
code improvements, and then after that write up his suggestions
regarding the Scheme API... So I guess after he has done that we can
discuss what parts of that you may be interested to do on what
time-frame...

--- ben

Shujing Ke

Feb 12, 2017, 6:36:02 AM

to Ben Goertzel, opencog, Nil Geisweiller, Hedra Seid

The current pattern miner is suitable for relation mining, and perhaps also for mining natural-language data, but it is not yet suitable for data involving numbers or structures, such as biological or chemical data.

Nil Geisweiller

Feb 16, 2017, 9:10:25 AM

to Ben Goertzel, Nil Geisweiller, Shujing Ke, opencog, Hedra Seid

Hi,

On 02/11/2017 10:18 PM, Ben Goertzel wrote:
>> I'm not familiar enough with the pattern matcher yet to suggest a Scheme
>> API. I do believe I know however the next steps to clean it up. So I'll look
>> into that first, create a github issue, then look into the API design.
>
> Thanks, that will be helpful!

https://github.com/opencog/opencog/issues/2615

It is far from complete, but it's a start, and probably enough to
keep someone busy for a while. Happy to provide more details if needed.

> Maybe we can try, first of all, a wiki page for summarizing the
> near-term examples for which we'd like to see pattern mining work...
>
> E.g. on my radar for this year we have (in no particular order)
>
> -- Mining of inference histories, for inference control
>
> -- Mining of dialogue histories, for learning dialogue patterns (or
> more generally, verbal/nonverbal interaction patterns)
>
> -- Mining of sets of genomic datasets or medical patient records, to
> find surprisingly common combinations of features
>
> -- Mining of surprising combinations of visual features in the output
> of a relatively "disentangled" deep NN (such as the
> pyramid-of-InfoGANs that Ralf, Selameab, Tesfa, Yenat and I are
> working on)
>
> -- Mining of surprising combinations of semantic relationships, in the
> R2L output of a large number of simple sentences read into Atomspace
>
> -- Mining of surprising combinations of syntactic relationships, in an
> Atomspace containing a set of syntactic relationships corresponding to
> each word in the dictionary of a given language (to be done
> iteratively within the language learning algorithm Linas is
> implementing)
>
> -- Mining of surprising (link-parser link combination,
> Lojban-Atomese-output combination) pairs, in a corpus of (link parses,
> Lojban-Atomese outputs) obtained from a parallel (English, Lojban)
> corpus

Thanks. I've added that to the pattern miner wiki page:
http://wiki.opencog.org/w/Pattern_miner#Use_cases

>
>> Personally, I would like to see an API that facilitates interaction with
>> the URE. I don't really have concrete examples, just an idea.
>
> Hmmm...
>
> Well one could say there is a rule
>
> FindSignificantPatterns
>
> whose input is
>
> -- a template T restricting what kinds of patterns to look for
>
> -- a GroundedSchemaNode containing the significance measure one wants to use
>
> and whose output is, say, a SetLink containing the most significant
> patterns found
>
> This is a pretty computationally expensive inference rule though ;)

True, but so is any evidence-based/direct-computation rule producing
ImplicationLinks and such. The thing is, the pattern miner has a set of
constraints that could perhaps be seen as preconditions by the backward
chainer (BC), to avoid generating meaningless candidates, such as those
with too little support, etc. Just an idea...
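
As a rough illustration of a support constraint acting as a precondition, here is a small Python sketch. It is not the Pattern Miner's or the URE's actual mechanism: patterns are plain item sets, and `support` and `mine_with_precondition` are hypothetical names. A candidate is recorded and extended only if it passes the minimum-support check, so low-support branches are never expanded.

```python
def support(pattern, transactions):
    """Number of transactions containing every item of `pattern`."""
    return sum(1 for t in transactions if pattern <= t)

def mine_with_precondition(transactions, min_support):
    """Level-wise mining where minimum support acts as a precondition:
    a candidate that fails it is neither kept nor extended."""
    items = sorted(set().union(*transactions))
    frontier = [frozenset({i}) for i in items]
    accepted = {}
    while frontier:
        next_frontier = set()
        for pattern in frontier:
            s = support(pattern, transactions)
            if s < min_support:   # precondition fails: prune this branch
                continue
            accepted[pattern] = s
            # Only accepted patterns generate larger candidates.
            for i in items:
                if i not in pattern:
                    next_frontier.add(pattern | {i})
        frontier = list(next_frontier)
    return accepted

# Made-up transactions over items a, b, c, d.
transactions = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("d")]
accepted = mine_with_precondition(transactions, min_support=2)
```

Here the single item `d` fails the check at the first level, so no candidate is ever grown from it.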

Nil
