As facts and clauses asserted using assert/1 or one of its derivatives become
part of the program, these predicates compile the term given to them.
retract/1 and retractall/1 have to unify a term and therefore have to
decompile the program. For these reasons the assert/retract mechanism is
expensive. On the other hand, once compiled, queries to the database are
faster than querying the recorded database discussed below.
My first question is, what exactly is the cost of assert/retract? How
expensive is that mechanism?
Next: what are the alternatives? Is it possible to compile a predicate without
compiling the whole program? If so, is it more economical to limit database
manipulation to assert operations, avoiding retract ones as much as possible?
Also: are the costs so high that one would get significant gains by using a
different relational database, eg a SQL database, to read from and write terms
to during execution?
And finally: by Clocksin & Mellish, findall, setof etc are implemented using
assert/retract. That must mean that such predicates also incur the costs
discussed above. Is that correct?
I try to avoid assert/retract and record/erase if possible. By using plain Prolog data structures to track my data, I often get cleaner code and excellent performance. If that's not possible, you could try flag/3 which is 35% faster than record (730 ns), although the semantics are quite a bit different.
Another possibility is to place a Prolog data structure in a global variable. I consider this more of a hack than a good idea. It also has different semantics on backtracking. However, it gives you "assertions" that are roughly as fast as record/2 (1300 ns) and gives you erasure almost for free (0 ns).
You'll probably have to measure your specific configuration to know for sure. There are substantial variations between databases and across networks.
A fast, local key-value store like LevelDB might also be an option. I'm not aware of any existing bindings for SWI Prolog though.
findall and friends use an internal data structure written in C (see src/pl-bag.c in the source code) so I suspect they're much faster than an implementation built on assert/retract.
I usually manage without assert/retract also. Recently though I wanted to store a large number of compound terms that represent data associated with points on a three-dimensional grid, for a puzzle. Previously, I would put them all in a list and scan it repeatedly. The problem is that once you have more than a dozen terms or so in a list like that, it becomes rather cumbersome and a pain to debug. So I thought I could take advantage of Swi's indexed database and assert my point-terms with an extra first argument for a hash and retrieve them directly from the database as needed. I think that's faster than scanning a list but if the cost of asserting/retracting counters that speedup then there's no, er, point in it.
I wouldn't rule out global variables that quickly. They can be quite nice.
Global variables save you from having to specify arguments in [predicates].
Take full advantage of this. Elect one or more of these global variables to
specify what kinds of processes to do on the others. Maintenance programmers
foolishly assume that [Prolog predicates] will not have side effects.
Were balanced trees (such as the ones from library(assoc)) not appropriate for this task?
Hi Jan,Hmmm. This seems to contradict you:
On Thursday, 8 May 2014 09:31:07 UTC+1, Jan Wielemaker wrote:I wouldn't rule out global variables that quickly. They can be quite nice.Global variables save you from having to specify arguments in [predicates].
Take full advantage of this. Elect one or more of these global variables to
specify what kinds of processes to do on the others. Maintenance programmers
foolishly assume that [Prolog predicates] will not have side effects.
Taken from here: http://arxiv.org/pdf/0911.2899v3.pdf :)
More seriously- global variables look nice yes, particularly where they live on
the stack and they're thread-local. But I didn't consider them because I'm
going through this phase where I want to have everything separated into
modules with crisp interfaces and so on. The documentation hints that it would
be possible to have per-module global variables; that would be great :)
Hi other Michael,
On Thursday, 8 May 2014 11:38:03 UTC+1, Michael Ben Yosef wrote:
Were balanced trees (such as the ones from library(assoc)) not appropriate for this task?
I seem to remember it was library(rbtrees) that was the fast implementation of
a balanced tree, whereas library(assoc) didn't give you any gains in
performance?
Anyway, I think the reason why either balanced tree library is not my first
option is because the notation is a little on the cumbersome side. It's an
option, certainly, though the 'value' part would have to be a compound or a
dict anyway.
Another reason why asserting/retracting compound terms seems attractive to me
is because I've been trying to look at the Prolog database as if from the
point of view of a SQL developer. In that sense, retrieving and updating data
stored in simple relations seems very natural and hassle-free. Even from the
point of view of someone who knows some Prolog already, keeping your 'data' as
facts associated with a module is a very simple and nice way to use the Prolog
database as, well, a database.
I understand well enough how asserting/retracting in the middle of a query can
cause trouble but imagine a typical website or even some piece of enterprise
software, only with Prolog as a logic engine and/or backend. What would you do
if you had to store, say, 100,000 records of users? It would be nice to be
able to design that part of the program exactly as a traditional relational
database (only much simpler because you could do away with all the foreign
keys clutter) and then actually create it in Prolog rather than SQL.
Regards,
Stassa
--
You received this message because you are subscribed to the Google Groups "SWI-Prolog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to swi-prolog+...@googlegroups.com.
Visit this group at http://groups.google.com/group/swi-prolog.
For more options, visit https://groups.google.com/d/optout.
On Saturday, May 10, 2014 8:53:12 AM UTC+1, Jan Wielemaker wrote:
Just name them <module>_xyz and it should not really be a problem.
On Saturday, May 10, 2014 8:53:12 AM UTC+1, Jan Wielemaker wrote:
It is only the complexity of the term
during debugging that hinders you if I recall correctly.
On Friday, May 9, 2014 4:07:16 PM UTC+1, boris.vassilev wrote:
It comes down to your very specific use case. You can program it, best in little code that takes not too long to write and that is "obviously" correct. If the program solves your problem fast enough, then you are done. If it doesn't, and you plan to run your program many times, you would need actually profile.
On Friday, May 9, 2014 4:07:16 PM UTC+1, boris.vassilev wrote:
As for using the Prolog database, my impression is that it gives a cleaner solution only if you normalize your data, which is a bit forced with some kinds of data. And it has to fit in memory (maybe I am wrong about this?).
--
I think the fact that subPropertyOf reasoning does not deal with
SubPropertyOf SubPropertyOf has mostly theoretical value. I've
not see it happen in practice, but here you seem to have control over
your data, so it is not an issue at all. The RDF database offers some
nice things over the Prolog one. One is a standard import and export
format. Others are quite advanced indexing and proper transactions.
And no, it does not do assert/retract. The data is stored using
a linked in C library.
To a certain extend, these are all pretty interchangeable. They all
have different properties if it comes to persistency, startup times,
performance, etc. See also http://www.swi-prolog.org/FAQ/PrologLAMP.txt
The choice between using Prolog terms and some form of side-effect
storage also depends on how you want your data to behave on backtracking.
Using the Prolog dynamic db as a relational DB is not uncommon. To name
two projects: JTransformer
(https://sewiki.iai.uni-bonn.de/research/jtransformer/start) and Thea
(http://www.semanticweb.gr/thea/). Especially since SWI-Prolog (like
YAP) has JIT index on multiple
arguments, you typically get really great performance.
just wondering: why are you not considering using a Prolog term that is a data
structure fitting your exact problem (as was Michael's suggestion earlier as
well)?