Serious performance issue due to duplication of ground rules


Jerry Ding

Mar 20, 2020, 5:09:03 PM
to PSL Users
In preparation for building a more complex semantic inference model, I set up a framework of predicates and built a trivial demo out of it. I ran lazy MPE inference, and the process was much slower than I expected given the simplicity of the demo. It turned out to have generated 36,177 ground rules when I was expecting about 1,000.

By outputting the list of satisfactions, I found a huge number of duplicates among the ground rules - some rules had nearly 100 identical copies. With the duplicates removed, the final iteration of the lazy optimization would have 781 variables and 1,312 objective terms, a small convex optimization problem that ought to be solvable in under a second.

Is there a workaround to produce fewer ground rules? If not, can a patch be made relatively soon?

Manually creating the target set is out of the question, since the nature of my problem should activate only a very sparse subset of all possible ground rules. If I enumerated all potential targets combinatorially, I would have an intractable number of ground terms, and I would need a fully-fledged logical inference engine to prune that target set down to the tiny subset that might actually have a nonzero distance to satisfaction. Judging from the 1,312 unique terms, the lazy atom manager in PSL did a good job of identifying the relevant ground rules; it just didn't de-duplicate them properly.

I have the model and output logs attached if it helps.
all.psl
command.txt
data.zip
meta.data
output.txt

Eriq Augustine

Mar 20, 2020, 6:52:05 PM
to Jerry Ding, PSL Users
Hey Jerry,

First, thanks a bunch for including all the necessary information/files.
It made my job MUCH easier.

You are right!
There is a bug in lazy inference that causes multiple ground rules to be created.

I am uploading the fix to the Maven servers now as version 2.2.2.
However, the Maven index can take a while to update, so if you need the fix ASAP you can build a local version of PSL.
You will see the new version up here when the Maven index is updated:

-eriq


Jerry Ding

Mar 21, 2020, 3:32:33 PM
to PSL Users
Eriq,

Thank you for the quick patch. I had been experimenting with a locally compiled copy in which I changed the data structure in MemoryGroundRuleStore from a list to a set, but your change also fixes the problem, so I'll switch to version 2.2.2.
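
For reference, my experiment amounted to something like the sketch below (names and internals are simplified for illustration - the real MemoryGroundRuleStore is more involved, and this relies on ground rules having consistent equals/hashCode implementations):

    import java.util.LinkedHashSet;
    import java.util.Set;

    import org.linqs.psl.model.rule.GroundRule;

    // Illustrative store that silently drops duplicate ground rules.
    public class DedupedGroundRuleStore {
        // A LinkedHashSet preserves insertion order while rejecting
        // any ground rule equal to one that is already stored.
        private final Set<GroundRule> groundRules = new LinkedHashSet<>();

        public void addGroundRule(GroundRule rule) {
            groundRules.add(rule);
        }

        public int size() {
            return groundRules.size();
        }
    }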

I'm not sure whether you can tell from my PSL file, but I am attempting to batch-process a large number of sentences by concatenating multiple models together on top of a shared set of rules (as opposed to having a small model and a large number of closed predicates, since I'm setting custom weights for my priors). Do you have any general tips on how to maximize the efficiency of this kind of approach?

There were a few general quirks I noticed about the framework. It seems that using hard constraints instead of rules with strong weights is more likely to result in a message saying that ADMM failed to find a feasible result, even when neither version appears to violate any of the rules. I also find a trade-off between the strength of the negative priors I use to ensure deterministic results and the number of iterations needed for convergence: a strong negative prior tends to distort the inference results, but a weak negative prior doesn't converge easily. Lastly, I noticed that for the best results I have to scale the ADMM step size proportionally to the number of sentences I process. Are these expected phenomena, or signs that I'm using the tool in a weird way?

Jerry

Eriq Augustine

Mar 21, 2020, 4:02:28 PM
to Jerry Ding, PSL Users
Hey Jerry,

I have three different recommendations for performance depending on how large you expect your model to get:
  • About the same size as the code you gave on this thread:
    • You should be all set with what you are doing now.
  • In the order of millions (or tens of millions if you have a strong machine):
  • Even larger:
    • Do everything above.
    • Move away from lazy inference and specify your full model in your data files (I know this sounds counterintuitive).
    • Use Tandem Inference:
      • Use `--infer SGDStreamingInference` instead of `--infer LazyMPEInference` (example command after this list).
      • Paper for reference: https://linqs.soe.ucsc.edu/node/356
      • TI uses a disk cache and a fixed amount of memory. We have run models with billions of ground rules in 8-10 hours with 10GB of RAM.
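
For example, switching over to TI would look something like this (assuming the standard CLI jar and the files you attached, with meta.data as the data specification):

    java -jar psl-cli-2.2.2.jar --infer SGDStreamingInference --model all.psl --data meta.data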


It seems that using hard constraints instead of rules with strong weights is more likely to result in a message saying that ADMM failed to find a feasible result, even when neither version appears to violate any of the rules.

This has to do with how ADMM prioritizes between a configuration option that wants to break early (admmreasoner.objectivebreak) and satisfying the hard constraints.
Previously, ADMM would favor the configuration option.
However, we recently (just two weeks ago) changed our stance here and made satisfying constraints a higher priority:
(But this change has not been released in any version yet.)
You can disable the early-break option like so:
-D admmreasoner.objectivebreak=false
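
In context, that would look something like this (again assuming the attached files):

    java -jar psl-cli-2.2.2.jar --infer LazyMPEInference --model all.psl --data meta.data -D admmreasoner.objectivebreak=false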

I also find a trade-off between the strength of the negative priors I use to ensure deterministic results and the number of iterations needed for convergence: a strong negative prior tends to distort the inference results, but a weak negative prior doesn't converge easily.

I would say that this is a per-model thing, and not unexpected.
The interaction of the other rules and constraints really determines how much influence the prior has.

Lastly, I noticed that for the best results I have to scale the ADMM step size proportionally to the number of sentences I process. Are these expected phenomena, or signs that I'm using the tool in a weird way?

This is also a per-model thing, and not unexpected.
It really matters how much contention there is between the ground rules.
With the objectivebreak change above, you may not need to scale the step size anymore (since more iterations will run).


From what I can tell, your modeling looks good.
Feel free to let us know if you are having any more issues.

-eriq

