Performance Results (can't make sense of them)

Pouyan Ziafati

Nov 29, 2014, 8:43:25 AM
Dear All,

I am measuring the performance of an application for the NAO robot that involves asserting, retracting and looking up terms. However, I cannot interpret the results.

The application receives a continuous stream of data. Each data item is a ground term of the form tf(X,Y,Z,W), where X and Y are atoms and Z and W are lists of floats.
Data items are selectively recorded by so-called memory instances. For example, memory_instance(tf(X,Y,Z,W), 2500, id1) is a memory instance with id id1 that records all input data. Its size is 2500, which means that only the last 2500 data items are kept in memory. After that, for each new item recorded, the oldest one is deleted. Another example is memory_instance(tf(head,camera,Z,W), 2500, id2), which records only data of the form tf(head,camera,Z,W).

An input data item is recorded by the relevant memory instances by calling the predicate feed_memory_instances(Data), implemented as follows. This predicate implements a failure-driven loop that goes over all memory instances and records the data item in each matching one. If a memory instance has reached its size limit, its oldest data item is deleted. Each memory instance has a counter that counts the total number of items it has recorded. Counters are kept in global variables and are initially set to zero.

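feed_memory_instances(Data) :-
    % Runs once per matching memory instance; Size, Id, Counter and the
    % hash values come from code not shown here.
    NewCounter is Counter + 1,
    asserta(memItem(HashNew, Id, NewCounter, Data)),
    (   Counter >= Size
    ->  Old is Counter - Size + 1,
        retract(memItem(HashOld, Id, Old, _))
    ;   true
    ).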
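For readers who want to run the scheme described above, here is a minimal self-contained sketch for SWI-Prolog. Several parts are assumptions rather than the original code: the post does not show how the hash is computed, so term_hash/2 over Id-Counter is used as a stand-in; the counters are stored with nb_setval/2 under the instance id; init_counters/0 is a hypothetical helper; and the sizes are reduced to 3 to keep the demo short.

```prolog
:- dynamic memItem/4.

% Memory instances from the post, with size 3 instead of 2500 (demo only).
memory_instance(tf(_, _, _, _), 3, id1).
memory_instance(tf(head, camera, _, _), 3, id2).

% Hypothetical helper: set every per-instance counter to zero.
init_counters :-
    forall(memory_instance(_, _, Id), nb_setval(Id, 0)).

% Failure-driven loop over every memory instance whose pattern unifies
% with Data.
feed_memory_instances(Data) :-
    memory_instance(Data, Size, Id),
    nb_getval(Id, Counter),
    NewCounter is Counter + 1,
    nb_setval(Id, NewCounter),
    term_hash(Id-NewCounter, HashNew),      % assumed hashing scheme
    asserta(memItem(HashNew, Id, NewCounter, Data)),
    (   Counter >= Size
    ->  Old is Counter - Size + 1,
        term_hash(Id-Old, HashOld),
        retract(memItem(HashOld, Id, Old, _))
    ;   true
    ),
    fail.                                   % try the next memory instance
feed_memory_instances(_).
```

After init_counters, feeding five items of the form tf(head,camera,Z,W) should leave three memItem/4 facts for each of id1 and id2, since both patterns match and each instance keeps only its last three items.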

Performance: The attached graph shows the performance results for memory instances. The "NAO Example" is an application for the NAO robot. We add a varying number of memory instances to this application and measure the increase in CPU time. The input rate is about 1900 data items per second.

The problem: The graph shows that adding 10 memory instances increases CPU time by about 10 percentage points: compare the results for 20, 30 and 40 memory instances. However, adding the first 10 memory instances increases CPU time by about 20 percentage points. I have traced the program and made various runs, so I strongly believe the program works as it should. I just cannot make sense of these results. Any ideas?

Many thanks,


Jan Wielemaker

Nov 29, 2014, 10:56:39 AM
to Pouyan Ziafati
On 11/29/2014 02:43 PM, Pouyan Ziafati wrote:
> Dear All,
> I am measuring performance of an application for NAO robot involving
> assertion, retraction and look-up of terms. However, I cannot interpret
> the results.

With my poor eyes, interpreting the colours is a bit hard. It is also
unclear what you are really timing. I guess there is more involved than
the feed_memory_instances/1?

By intuition, I would use a separate predicate for each `memory instance'
and, if the order doesn't matter, add new facts at the end, so you can
retract from the start for cleanup. So, you can do something like this:

memory_instance(tf(X,Y,Z,W), 2500, id1(X,Y,Z,W)).

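feed_memory_instances(Data) :-
    memory_instance(Data, Size, Clause),
    assertz(Clause),                    % add the new item at the end
    (   predicate_property(Clause, number_of_clauses(N)),
        N > Size
    ->  functor(Clause, Name, Arity),
        functor(Generic, Name, Arity),
        once(retract(Generic))          % remove the oldest item from the start
    ;   true
    ).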

This stores less data and allows proper indexed access to your memory
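A runnable sketch of this per-instance layout for SWI-Prolog, under a few assumptions: the helper name record_item/2 is hypothetical, the second instance id2/2 (which stores only the varying arguments Z and W) is an illustration of "stores less data" and does not come from the thread, and the sizes are reduced to 3 for the demo.

```prolog
:- dynamic id1/4, id2/2.

% Each memory instance keeps its items as facts of its own predicate.
memory_instance(tf(X, Y, Z, W), 3, id1(X, Y, Z, W)).
memory_instance(tf(head, camera, Z, W), 3, id2(Z, W)).

% Record Data in every memory instance whose pattern matches.
feed_memory_instances(Data) :-
    forall(memory_instance(Data, Size, Clause),
           record_item(Clause, Size)).

% Hypothetical helper: append the new fact, then trim the oldest one
% if the predicate now holds more than Size clauses.
record_item(Clause, Size) :-
    assertz(Clause),
    (   predicate_property(Clause, number_of_clauses(N)),
        N > Size
    ->  functor(Clause, Name, Arity),
        functor(Generic, Name, Arity),
        once(retract(Generic))      % first clause is the oldest
    ;   true
    ).
```

Because new facts go to the end with assertz/1, the first clause of each predicate is always the oldest, so trimming needs neither a counter nor a hash. The once/1 keeps retract/1 from removing further clauses on backtracking.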

To get further insight, run the experiment using ?- profile(Goal). I'd
suspect that (lack of) indexing is an issue. A couple of thousand
new facts per second should really not be a problem.
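One way to give profile/1 a repeatable workload is to replay synthetic data through the recorder. The sketch below is only an illustration: replay/1 and mem_item/2 are hypothetical names, and the feed_memory_instances/1 shown here is a trivial stand-in that should be replaced by the real predicate.

```prolog
:- dynamic mem_item/2.

% Hypothetical stand-in for the real recorder: one fact per data item.
feed_memory_instances(Data) :-
    assertz(mem_item(id1, Data)).

% Replay N synthetic tf/4 items so the profiler sees a fixed workload.
replay(N) :-
    forall(between(1, N, I),
           (   T is float(I),
               feed_memory_instances(tf(head, camera, [T], [T]))
           )).

% At the toplevel:
% ?- profile(replay(100000)).
```

The profiler output then shows which predicates account for most of the time, which should indicate whether the assert/retract calls or something else in the application dominates.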

Cheers --- Jan

Pouyan Ziafati

Nov 29, 2014, 12:35:37 PM
Thanks Jan.

I'll modify the code according to your suggestions. That would improve the performance, but I doubt that it would answer my question. What I cannot understand is the following:

I have an application which runs at 20 % CPU time. I call this application base.

If I add 10 memory instances to base, CPU time is 40 %.
If I add 20 memory instances to base, CPU time is 50 %.
If I add 30 memory instances to base, CPU time is 60 %.
If I add 40 memory instances to base, CPU time is 70 %.

As expected, CPU time increases linearly with the number of memory instances. However, this is not the case when the first 10 memory instances are added. The first 10 memory instances result in an increase of 20 percentage points. After that, each further 10 memory instances result in an increase of 10 percentage points. Intuitively, you would expect that, if there were a problem with indexing, adding more memory instances would result in progressively poorer performance. However, it seems indexing on the hash function works well here. I just cannot understand the extra overhead when the first set of memory instances is added.
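The arithmetic behind this observation can be written down as a small sketch. It only restates the figures listed above and does not explain their cause; the predicate names cpu/2, slope/3 and fixed_overhead/1 are made up for the illustration.

```prolog
% CPU time (percent) by number of memory instances, as listed above.
cpu(0, 20).
cpu(10, 40).
cpu(20, 50).
cpu(30, 60).
cpu(40, 70).

% Slope between two measurements, in percentage points per instance.
slope(N1, N2, Slope) :-
    cpu(N1, C1),
    cpu(N2, C2),
    Slope is (C2 - C1) / (N2 - N1).

% Extrapolate the 10..40 trend back to zero instances and compare the
% result with the measured base. The difference is the part of the
% first step that does not scale with the number of instances.
fixed_overhead(Overhead) :-
    slope(10, 40, Slope),
    cpu(10, C10),
    cpu(0, Base),
    Intercept is C10 - Slope * 10,
    Overhead is Intercept - Base.
```

With these numbers, the trend from 10 to 40 instances is 1 percentage point per instance and extrapolates to 30 % at zero instances, so ?- fixed_overhead(O). gives 10: a one-off cost of about 10 percentage points that appears as soon as any memory instances are present.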

Kindest regards,

PS: I'll give the profile(Goal), as soon as I find some time.