PatternMatchEngine actively modifies its state during matching so it cannot be used from more than one thread. But we can make one copy for each thread and use copies safely, as PME doesn't reuse its state between different calls of explore_neighborhood() method (1).
Two main objects participate in matching: PatternMatchEngine instance and PatternMatchCallback instance.
PatternMatchCallback is different thing.
- (2) grounding() method modifies the state (saves matching results)
- (3) evaluate_sentence() method in DefaultPatternMatchCB uses temporary atomspace to evaluate things and cleans up atomspace after evaluation so it should be synchronized
- (4) scope_match()/link_match()/post_link_match()/post_link_mismatch() methods in DefaultPatternMatchCB share state between calls via pair of pointers: _pat_bound_vars and _gnd_bound_vars - in original code- push()/pop() methods are intended to modify the state but there are no nontrivial implementations
(1) - to not make unnecessary copies of PatternMatchEngine on each iteration -
(3) - I used thread local here to keep implementation as is. At the moment temporary atomspace is created once and reused each time when it is required to evaluate the term. grab_transient_atomspace() method gets it from cache using mutex to be thread safe, so it is not clear why it is kept in DefaultPatternMatchCB state. I agree that it can be removed from state and received each time when it is needed from cache. Performance have to be measured to understand if there is significant loss.
(4) - this is most complex case because there is no simple way using stack: link_match() sets _pat_bound_vars and _gnd_bound_vars and scope_match()/post_link_match()/post_link_mismatch() methods use it. Straightforward solution I see is to move _pat_bound_vars and _gnd_bound_vars initialization code out of DefaultPatternMatchCB into PatternMatchEngine and pass them to scope_match but it requires callback signature changing.
Thanks,Vitaly2018-04-25 18:42 GMT+03:00 Linas Vepštas <notifi...@github.com>:OK, now some comments re thread-local.
So, one standard idiom for storage that gets automatically cleaned up is to put that storage on the stack. When the stack unwinds, the storage is cleaned up automatically. I don't see why this is not sufficient in this case.
Separately, I kind of would have liked to talk to you about parallelizing the pattern matcher before you started writing code, instead of after. The "natural", simplest, easiest place to parallelize would be to make the initiate-search loops parallel, and then allow each search to run in a distinct thread. This, it would seem to me, to be a natural place to set up the per-thread stack: its located "at the root" of the search, and so is guaranteed to be alive, until that thread finishes.
Doing this would seem to obviate the need for some per-thread storage that shows up "from thin air", in the middle of some random code doing some random thing. I'm rather nervous about the debuggability and correctness of what you are proposing to do.
Anyway, this pull req is the wrong place to have this discussion.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.