Looking into contributing to Parallelizing OpenCog Pattern Matcher.


Parth Thakkar

Jan 24, 2018, 1:06:59 AM
to opencog
I was going through the suggested projects page, and found this page: (https://wiki.opencog.org/w/Parallelizing_OpenCog). I'd like to know if it is still in scope.

Nil Geisweiller

Jan 24, 2018, 3:46:56 AM
to ope...@googlegroups.com, Parth Thakkar
Regarding multi-threaded pattern matcher, I've always wondered, why not
just wrap the following loops

https://github.com/opencog/atomspace/blob/master/opencog/query/InitiateSearchCB.cc#L393
https://github.com/opencog/atomspace/blob/master/opencog/query/InitiateSearchCB.cc#L634
https://github.com/opencog/atomspace/blob/master/opencog/query/InitiateSearchCB.cc#L821

in omp algos (see
https://github.com/opencog/cogutil/blob/master/opencog/util/oc_omp.h)
since most of the time (always?) `found` is set to false anyway.

By setting min_n appropriately (see
https://github.com/opencog/cogutil/blob/master/opencog/util/oc_omp.h#L64)
we'd avoid unnecessary multi-threading overhead.

All we'd need to have this working would be to make sure that the PM
code is thread-safe (maybe easier said than done).

What do you guys think?

Nil
> --
> You received this message because you are subscribed to the Google
> Groups "opencog" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to opencog+u...@googlegroups.com
> <mailto:opencog+u...@googlegroups.com>.
> To post to this group, send email to ope...@googlegroups.com
> <mailto:ope...@googlegroups.com>.
> Visit this group at https://groups.google.com/group/opencog.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/opencog/4feba3cc-72bd-42fc-ab13-08fef4d81b8d%40googlegroups.com
> <https://groups.google.com/d/msgid/opencog/4feba3cc-72bd-42fc-ab13-08fef4d81b8d%40googlegroups.com?utm_medium=email&utm_source=footer>.
> For more options, visit https://groups.google.com/d/optout.
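
A minimal sketch of the proposal above, under stated assumptions: it uses a plain OpenMP pragma rather than the wrappers in oc_omp.h, and `Candidate`, `search_all` and `explore` are hypothetical names that do not appear in InitiateSearchCB.cc. Only `min_n` is taken from the message. Because an OpenMP loop cannot break early, the sketch visits every candidate, which matches the observation that `found` is false most of the time.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a search candidate; the real pattern matcher
// works on atom handles.
using Candidate = int;

// Runs `explore` on every candidate and returns how many reported a match.
// With fewer than `min_n` candidates the loop stays sequential, so small
// searches do not pay thread start-up costs.
// `explore` must be safe to call from several threads at once.
std::size_t search_all(const std::vector<Candidate>& candidates,
                       const std::function<bool(Candidate)>& explore,
                       std::size_t min_n)
{
    assert(static_cast<bool>(explore));  // a callback must be supplied

    std::atomic<std::size_t> matches{0};
    const long n = static_cast<long>(candidates.size());

    // If the compiler has no OpenMP support the pragma is ignored and the
    // loop runs sequentially with the same result.
    #pragma omp parallel for schedule(dynamic) if(candidates.size() >= min_n)
    for (long i = 0; i < n; ++i) {
        if (explore(candidates[i]))
            matches.fetch_add(1, std::memory_order_relaxed);
    }
    return matches.load();
}
```

The hard part is hidden in the comment on `explore`: the sketch is only correct if whatever runs inside the loop body really is thread-safe.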

Ben Goertzel

Jan 24, 2018, 4:00:14 AM
to opencog, Parth Thakkar
Hmm, interesting

So if I understand right, this is just sort of like letting the PM run
independently on N processors, exploring different parts of the
hypergraph...

Not as fancy as a parallel backtracker, but might work almost as well
in most common cases... I'd need to think/study more ...

--
Ben Goertzel, PhD
http://goertzel.org

"In the province of the mind, what one believes to be true is true or
becomes true, within certain limits to be found experientially and
experimentally. These limits are further beliefs to be transcended. In
the mind, there are no limits.... In the province of connected minds,
what the network believes to be true, either is true or becomes true
within certain limits to be found experientially and experimentally.
These limits are further beliefs to be transcended. In the network's
mind there are no limits." -- John Lilly

Nil Geisweiller

Jan 24, 2018, 6:11:31 AM
to ope...@googlegroups.com, Ben Goertzel, Parth Thakkar
On 01/24/2018 11:00 AM, Ben Goertzel wrote:
> Hmm, interesting
>
> So if I understand right, this is just sort of like letting the PM run
> independently on N processors, exploring different parts of the
> hypergraph...

That's the idea. It seems even with large variance in resources needed
for checking if a candidate is a match, it would still be OK as OpenMP
is supposed to support dynamic scheduling. Quoting from
http://pages.tacc.utexas.edu/~eijkhout/pcse/html/omp-loop.html

"Dynamic schedules are a good idea if iterations take an unpredictable
amount of time, so that load balancing is needed"

Looking at __gnu_parallel code it seems static and dynamic looping are
calling the same primitive however :\

We'd have to try to really know. Anyway, most of the work would go into
making the PM multi-thread safe, which, for the most part, will be
needed for parallel backtracking.

Nil
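
A small sketch of the dynamic scheduling mentioned above, written directly against OpenMP rather than __gnu_parallel. `check_candidate` and `check_all` are hypothetical names, and the busy loop only simulates candidates whose checks take very different amounts of time.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simulated candidate check whose cost grows with `work`; a stand-in for
// match attempts that take very different amounts of time.
std::uint64_t check_candidate(std::uint64_t work)
{
    std::uint64_t acc = 0;
    for (std::uint64_t k = 1; k <= work; ++k)
        acc += k % 7;
    return acc;
}

// Sums the results over all candidates. schedule(dynamic, chunk) hands out
// `chunk` iterations at a time, so a thread that finishes cheap candidates
// picks up more work instead of idling behind an expensive one.
std::uint64_t check_all(const std::vector<std::uint64_t>& costs, int chunk = 1)
{
    assert(chunk > 0);  // OpenMP requires a positive chunk size

    std::uint64_t total = 0;
    const long n = static_cast<long>(costs.size());

    #pragma omp parallel for schedule(dynamic, chunk) reduction(+:total)
    for (long i = 0; i < n; ++i)
        total += check_candidate(costs[i]);

    return total;
}
```

Swapping `dynamic` for `static` in the pragma and timing both on a skewed cost list is one way to check whether the scheduling choice matters for a given workload.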

Linas Vepstas

Jan 25, 2018, 4:37:28 PM
to opencog, Parth Thakkar
On Wed, Jan 24, 2018 at 3:00 AM, Ben Goertzel <b...@goertzel.org> wrote:
> Hmm, interesting
>
> So if I understand right, this is just sort of like letting the PM run
> independently on N processors, exploring different parts of the
> hypergraph...
>
> Not as fancy as a parallel backtracker, but might work almost as well
> in most common cases... I'd need to think/study more ...

In my experience, the biggest performance gains come not from micro-parallelizing small things, but from running big things in parallel. Why?

1) creating things like fine-grained backtracking takes a lot of engineering effort - especially measuring the system to see what it is actually doing.   If you have that kind of talent, there are more important problems it can be applied to.

2) the actual gains tend to be small, because of Amdahl's law.

3) thread-management has a non-zero and significant amount of overhead.  It can swamp your problem, if the problem is small enough.

And finally, the king:

4) Often, a fundamental system redesign can offer far greater performance advantages as compared to parallelizing or tuning the current system.  However, this is "it depends" and is often too nebulous to do anything about.
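
Point 2 can be made concrete with the usual statement of Amdahl's law, speedup = 1 / ((1 - p) + p / n), where p is the fraction of run time that can be parallelized and n is the number of threads. The function name and parameters below are illustrative only.

```cpp
#include <cassert>

// Amdahl's law: upper bound on the speedup of a job when a fraction
// `parallel_fraction` of its run time is spread over `threads` workers
// and the rest stays serial.
double amdahl_speedup(double parallel_fraction, unsigned threads)
{
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    assert(threads >= 1);
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / threads);
}
```

For example, if only half of the run time sits in the loop being parallelized, 8 threads give at most about 1.78x, and no number of threads can give more than 2x. Thread-management overhead (point 3) only lowers that bound further.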

For the pattern matcher, right now, my best and brightest idea is here:  https://github.com/opencog/atomspace/issues/1502

It describes an "AtomSpaceLink" that describes the membership of an atom in a particular atomspace. Such membership could be (should be) used to limit the search space, and to generally group atoms into "sets".  Well, I guess this has little to do with parallelizing, other than that it has the power to control the size of the "problem space" or the "scope", thus providing a different way of controlling the performance of the system (i.e. by limiting scope).

--linas
 


--
cassette tapes - analog TV - film cameras - you

Linas Vepstas

Jan 25, 2018, 4:46:37 PM
to opencog, Ben Goertzel, Parth Thakkar
On Wed, Jan 24, 2018 at 5:11 AM, 'Nil Geisweiller' via opencog <ope...@googlegroups.com> wrote:


> We'd have to try to really know.

I'm not against the idea. InitiateSearchCB really is the right place for this, and it's the "low-hanging fruit" for such an effort.

Step one is to assemble several "reasonable" or "typical" datasets, along with "typical" queries performed on them.
Step two is to measure the actual performance and profile it (see where the cycles are spent).
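
A rough sketch of the measuring half of step two, assuming the query can be wrapped in a callable. `median_runtime_ms` is a hypothetical helper, not an existing AtomSpace or cogutil function, and it only gives wall-clock time; seeing where the cycles go still needs a profiler.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Runs `query` `repeats` times and returns the median wall-clock time in
// milliseconds (the upper median when `repeats` is even). The median is
// less sensitive to one-off stalls than the mean.
double median_runtime_ms(const std::function<void()>& query, int repeats)
{
    assert(repeats > 0);

    std::vector<double> samples;
    samples.reserve(static_cast<std::size_t>(repeats));

    for (int i = 0; i < repeats; ++i) {
        const auto start = std::chrono::steady_clock::now();
        query();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }

    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}
```

Running the same helper on the sequential and the parallel version of a query, on the same dataset, gives the before/after numbers that the decision should rest on.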

I am (strongly) against just running around and blindly slapping omp::parallel_for in various random places. Decades-long experience in zillions of cases proves this is in general a performance decelerator, bug-inducer, and all-around fuck-with-it bad idea.
 
> Anyway, most of the work would go into making the PM multi-thread safe, which, for the most part, will be needed for parallel backtracking.

Perhaps I'm crazy, but I believe the PM is already thread-safe. This tends to be the general coding principle used throughout the code. It should be good to go, and blandly slapping in an omp::parallel_for should "just plain work".

--linas
 


Nil Geisweiller

Jan 26, 2018, 2:38:25 AM
to ope...@googlegroups.com, Linas Vepstas, Ben Goertzel, Parth Thakkar
On 01/25/2018 11:46 PM, Linas Vepstas wrote:
> Step one is to assemble several "reasonable" or "typical" datasets,
> along with "typical" queries performed on them.
> Step two is to measure the actual performance and profile it (see
> where the cycles are spent).
>
> I am (strongly) against just running around and blindly slapping
> omp::parallel_for in various random places. Decades-long experience in
> zillions of cases proves this is in general a performance
> decelerator, bug-inducer, and all-around fuck-with-it bad idea.

I agree.

Regarding the datasets/queries. Some URE and pattern miner unit tests or
examples would suffice as far as *I* am concerned. Some of them take a
long time (like the MOSES-PLN synergy forward chaining test that takes a
few hours) and I can arbitrarily inflate the difficulty of some pattern
miner tests as well.

I wonder where such a benchmark script should live, under

https://github.com/opencog/atomspace/tree/master/opencog/benchmark

https://github.com/opencog/external-tools

or maybe in its own repository?

Or perhaps beside the <ATOMSPACE_REPO>/tests directory, being treated as
some sort of alternate unit tests dedicated to benchmarking, living
under <ATOMSPACE_REPO>/benchmark that would be invoked with

make benchmark

?

> Perhaps I'm crazy, but I believe the PM is already thread-safe. This
> tends to be the general coding principle used throughout the code. It
> should be good to go, and blandly slapping in an omp::parallel_for should
> "just plain work".

Cool! It will probably uncover some holes, I suppose, but that looks
like a good start.

Nil


Linas Vepstas

Jan 26, 2018, 2:50:02 AM
to Nil Geisweiller, opencog, Ben Goertzel, Parth Thakkar
On Fri, Jan 26, 2018 at 1:38 AM, Nil Geisweiller <ngei...@googlemail.com> wrote:


> or maybe in its own repository?

Probably its own repo, especially if it needs to store some datasets.
 

> Or perhaps beside the <ATOMSPACE_REPO>/tests directory, being treated as some sort of alternate unit tests dedicated to benchmarking, living under <ATOMSPACE_REPO>/benchmark that would be invoked with
>
> make benchmark

I like the principle, but ... the unit tests can be run 10x a day.  The benchmarks -- maybe only every few months, i.e. never, unless you are messing with the internals in such a way that performance will be altered.   Although some kind of periodic, automated checking would be good.

About 3-4 months ago, I accidentally slowed down the atomspace performance by maybe 5x or 10x and basically, no one noticed. Well, I noticed: things ran slow for my big jobs, but I did not realize that this wasn't normal. So it went undetected until I ran the benchmarks a few weeks ago, and had to git-bisect 6 months of git history to find the bug: a seemingly "harmless" change.

That would be an argument for running benchmarks daily or weekly.

--linas
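
One way the periodic, automated check described above could flag a slowdown is to compare each run against a stored baseline. This is only a sketch: `is_regression` and its tolerance parameter are hypothetical, and where the baseline is stored is left open.

```cpp
#include <cassert>

// True when `current_ms` is slower than `baseline_ms` by more than the
// allowed factor; e.g. max_slowdown = 1.2 tolerates up to 20% of noise.
bool is_regression(double baseline_ms, double current_ms, double max_slowdown)
{
    assert(baseline_ms > 0.0 && max_slowdown >= 1.0);
    return current_ms > baseline_ms * max_slowdown;
}
```

With a 20% tolerance, a 5x slowdown like the one described would be flagged on the first run after the offending commit, which keeps the range to bisect short.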
