Pros and cons of across iterator

43 views
Skip to first unread message

Finnian Reilly

unread,
May 17, 2018, 9:34:20 AM5/17/18
to Eiffel Users
I like the across iterator but I do not use it all the time as it has a number of distinct disadvantages.
  • It has an extra performance overhead as an iterator object needs to be created and then garbage collected therefore I would not use in a nested inner loop or in a routine that has the possibility of being iterated over a lot.
  • You cannot see the value in the debugger of it.item / it.key / it.cursor_index where it is the named as iterator item (or at least in the version of studio I am using) So using across makes your program harder to debug.
Surprisingly if you want the best performance you should use agent orientated do_all, for_all, do_if routines. It is surprising because normally we think of agent calls as being rather slow, but I remember benchmarking some loops and found this was the fastest (at least for an ARRAYED_LIST or an ARRAY). But you do in my opinion pay a price in readability, especially if you start using anonymous agents.

BTW
Incidentally I think EiffelStudio is lacking in convenient ways to generate standard iteration code. I have remedied this lack with a command line tool that can be setup as an external tool in EiffelStudio.

If I type
   my_routine
     
do
         
@from list.after
     
end

The tool expands it as
   my_routine
     
do
         
from list.start until list.after loop
            list
.forth
         
end
     
end


And if I type
   my_routine
     
do
         
@from i > x
     
end

it is expanded as
   my_routine
     
do
         
from i := 1 until i > x loop
            i
:= i + 1
         
end
     
end

There are other shorthand for common code constructs which get expanded too. This saves a lot of typing and increases productivity.

Larry Rix

unread,
May 17, 2018, 9:53:15 AM5/17/18
to eiffel...@googlegroups.com
Hi Finnian,

I agree that the across loop produces slow code. My comparison tests between from and across have shown that from is somewhere between 10x and 16x faster than the same thing in across.

I am very surprised to hear that the agent-based do_all (et al) calls are faster. Agents were presented to me as being rather slow overall. So, this surprises me. I have never benchmarked to find out.

With the across being so much slower than a from, it is agreeable that one ought never nest an across within an across unless one does not care about performance.

However—the purpose of this documentation task is not to get lost in the weeds of performance choices, but to write documentation that helps both the Eiffel veteran and the Eiffel novice (mostly the Novice me thinks—we want new consumers, while taking good care of our vets). Still—it might be worth noting the performance issues of the across loop in the documentation.

Larry Rix
Moonshot Software
Rocket science for everyone!
Savannah, GA

--
You received this message because you are subscribed to the Google Groups "Eiffel Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to eiffel-users+unsubscribe@googlegroups.com.
Visit this group at https://groups.google.com/group/eiffel-users.
For more options, visit https://groups.google.com/d/optout.

Larry Rix

unread,
May 17, 2018, 10:33:30 AM5/17/18
to eiffel...@googlegroups.com
Hi again, Finnian,

I am going to modify my statements about across being slower. As I was writing code (again) to make that benchmark comparison, I realized that Manu's statement of "not-the-same-thing" came back into view of my thinking.

Previously, I was using a from loop with an INTEGER index variable of "i" and then just doing "i := i + 1" in the loop body. This is great and good, but the across loop is not the same thing.

In the across, I was using an INTEGER_INTERVAL to loop across as a comparison. This is NOT the same thing (e.g. 1 |..| i), where i := some large value like 1_000_000.

I made a change to the code such that I am now comparing apples-to-apples. I created a local ARRAYED_LIST variable called l_container and loaded it with 1_000_000 INTEGER values.

When I did this, the results were far different.

Building container with items ...
from start: 37611.642
from end: 37611.775999999998
dif: 0.13399999999819556
across start: 37611.777000000002
across end: 37612.387999999999
dif: 0.61099999999714782

Press Return to finish the execution...

The from loop is looping 1_000_000 times, incrementing "i" as it goes.

The across loop is going across the l_container with 1_000_000 INTEGER values.

As you can see (above), the across loop is now only 4x slower vs. 16-10x using my previous method. It seems the moral of the story is: How the code is structured matters together with how the compiler is forming up the C/C++ behind it.

Another run, produces this result:

Building container with items ...
from start: 37822.087
from end: 37822.230000000003
dif: 0.14300000000366708
across start: 37822.230000000003
across end: 37822.817999999999
dif: 0.58799999999610009

Press Return to finish the execution...

The results are still only 4x slower. I am wondering now what other optimizations I can make to bring this closer together.



Larry Rix
Moonshot Software
Rocket science for everyone!
Savannah, GA

On Thu, May 17, 2018 at 9:34 AM, Finnian Reilly <frei...@gmail.com> wrote:

--

Colin Adams

unread,
May 17, 2018, 10:37:59 AM5/17/18
to eiffel...@googlegroups.com
The Eiffel compiler ought often to be able to optimise across over an array by compiling to an integer index loop instead.

To unsubscribe from this group and stop receiving emails from it, send an email to eiffel-users...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "Eiffel Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to eiffel-users...@googlegroups.com.

Larry Rix

unread,
May 17, 2018, 10:54:47 AM5/17/18
to eiffel...@googlegroups.com
I lowered the l_upper_bound from 1_000_000 to just 1_000. The results are interesting.

The across loop (in certain forms) is congruent with the from loop. The do_all agent is dismal in comparison.

Building container with 1_000_000 items ...

from start: 38919.322999999997
from end: 38919.324000000001
dif: 0.0010000000038417056

across l_container start: 38919.324000000001
across l_container end: 38919.324999999997
dif: 0.00099999999656574801

across (1 |..| l_upper_bound) start: 38919.326000000001
across (1 |..| l_upper_bound) end: 38919.326999999997
dif: 0.00099999999656574801

across l_interval start: 38919.328000000001
across l_interval end: 38919.330000000002
dif: 0.0020000000004074536

do_all start: 38919.330000000002
do_all end: 38919.586000000003
dif: 0.25600000000122236


Press Return to finish the execution...

  • Going across l_container is just as fast on 1_000 INTEGERs as the from loop (e.g. 0.001 vs. 0.0009).
  • The same is true of the INTEGER_INTERVAL variant.
  • Using a created INTEGER_INTERVAL object makes the across loop 2x slower than the others above.
  • The agent call to do_all is 256x slower, so this is in keeping with what I was told about using agents—they are EXPENSIVE!
This is all based on the following code:

local
i,
l_upper_bound: INTEGER
l_start,
l_end: TIME
l_container: ARRAYED_LIST [INTEGER]
l_interval: INTEGER_INTERVAL
l_result: BOOLEAN
do
-- Create the container
print ("Building container with 1_000_000 items ...%N%N")
l_upper_bound := 1_000 -- 1_000_000
create l_container.make (l_upper_bound)
across 1 |..| l_upper_bound as ic loop l_container.force (ic.item) end

-- from loop
create l_start.make_now
print ("from start: " + l_start.fine_seconds.out + "%N")
-- loop here ...
from
i := 1
until
i > l_upper_bound
loop
-- do loop stuff here ...
i := i + 1
end
create l_end.make_now
print ("from end: " + l_end.fine_seconds.out + "%N")
print ("dif: " + (l_end.fine_seconds - l_start.fine_seconds).out + "%N%N")

-- across loop (container)
create l_start.make_now
print ("across l_container start: " + l_start.fine_seconds.out + "%N")
-- loop here ...
across l_container as ic loop
end
create l_end.make_now
print ("across l_container end: " + l_end.fine_seconds.out + "%N")
print ("dif: " + (l_end.fine_seconds - l_start.fine_seconds).out + "%N%N")

-- across loop (interval_1)
create l_start.make_now
print ("across (1 |..| l_upper_bound) start: " + l_start.fine_seconds.out + "%N")
-- loop here ...
across (1 |..| l_upper_bound) as ic loop
end
create l_end.make_now
print ("across (1 |..| l_upper_bound) end: " + l_end.fine_seconds.out + "%N")
print ("dif: " + (l_end.fine_seconds - l_start.fine_seconds).out + "%N%N")

-- across loop (interval_2)
l_interval := (1 |..| l_upper_bound)
create l_start.make_now
print ("across l_interval start: " + l_start.fine_seconds.out + "%N")
-- loop here ...
across l_interval as ic loop
end
create l_end.make_now
print ("across l_interval end: " + l_end.fine_seconds.out + "%N")
print ("dif: " + (l_end.fine_seconds - l_start.fine_seconds).out + "%N%N")

-- do_all loop
create l_start.make_now
print ("do_all start: " + l_start.fine_seconds.out + "%N")
-- loop here ...
l_container.do_all (agent do_all_function)
create l_end.make_now
print ("do_all end: " + l_end.fine_seconds.out + "%N")
print ("dif: " + (l_end.fine_seconds - l_start.fine_seconds).out + "%N%N")

end

Larry Rix
Moonshot Software
Rocket science for everyone!
Savannah, GA

To unsubscribe from this group and stop receiving emails from it, send an email to eiffel-users+unsubscribe@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "Eiffel Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to eiffel-users+unsubscribe@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "Eiffel Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to eiffel-users+unsubscribe@googlegroups.com.

Colin Adams

unread,
May 17, 2018, 10:58:59 AM5/17/18
to eiffel...@googlegroups.com
Looks like you are measuring performance using workbench mode. That way lies madness.

To unsubscribe from this group and stop receiving emails from it, send an email to eiffel-users...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "Eiffel Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to eiffel-users...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "Eiffel Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to eiffel-users...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "Eiffel Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to eiffel-users...@googlegroups.com.

Eric Bezault

unread,
May 17, 2018, 11:01:13 AM5/17/18
to eiffel...@googlegroups.com, Larry Rix
On 5/17/2018 16:54, Larry Rix wrote:
> * Going across l_container is just as fast on 1_000 INTEGERs as the
> from loop (e.g. 0.001 vs. 0.0009).
> * The same is true of the INTEGER_INTERVAL variant.
> * Using a created INTEGER_INTERVAL object makes the across loop 2x
> slower than the others above.
> * The agent call to do_all is 256x slower, so this is in keeping with
> what I was told about using agents—they are EXPENSIVE!

They are expensive if the GC collecting everything that you created
before is triggered at that time. You should run your benchmarks
separately, not one after the other in the same execution.

--
Eric Bezault
mailto:er...@gobosoft.com
http://www.gobosoft.com

Bertrand Meyer

unread,
May 17, 2018, 11:01:40 AM5/17/18
to eiffel...@googlegroups.com, me...@inf.ethz.ch

All this is extremely interesting.

 

The contributors are expert Eiffel users. I would like to make sure that we keep the needs of novices in mind. If an iteration over 1 |..| 10 is not optimal, as Alexander Kogtenkov pointed out, this is important for advanced users (for whom, presumably, the top value is much more than 10 – for 10, who really cares?). But this may not be the first information that you have to learn about iteration. If iterating about 1 |..| n is simple and straightforward, as it is in fact (in my opinion at least), the documentation should highlight this form of iteration, early on. After people have learned the concepts in a first section, they can see a later section on “Performance considerations”. In the first section, the key goal, perhaps the only one, is to teach the basic concepts concisely, so that people appreciate their simplicity and ease of use, and can start using them right away. (Yes, I know, there’s the risk that someone will get carried away, write “across 1 |..| 10_000_000” and post that this Eiffel thing is horribly inefficient. But the alternative is much worse: people being presented with techniques to be told immediately that they are bad! One should never do that. Present the techniques from the viewpoint of clear, elegant, easy-to-learn concepts, and discuss optimization separately.)

 

Another example is tuples. Performance-conscious large-scale Eiffel programming often avoids repeated calls of the form r ([a, b, c]) if it can instead use r (t) where t is a tuple that can be reused, rather than allocated anew each time. (Rough description of a general heuristic, not exact advice.) This is not something we want to introduce first when describing tuples. We explain the concept, and for those who go further we provide optimization techniques.

 

All this not to disagree with any of the previous message, but to remind ourselves that we are not just talking to each other.

 

With best regards,

 

-- Bertrand Meyer

LASER summer school, Elba island,2-10 June, bitcoin/blockchains, https://www.laser-foundation.org/school/ 

Eric Bezault

unread,
May 17, 2018, 11:03:13 AM5/17/18
to eiffel...@googlegroups.com, Larry Rix
Furthermore, you should compare loops that do something
(the same thing, but something). Loops that do nothing
can be optimized away by the underlying C compiler.

lrix

unread,
May 17, 2018, 11:12:15 AM5/17/18
to Eiffel Users
I was noting some wild differences in my benchmark code runs. I noted that benchmark times were wild in Workbench mode and changed most greatly if I did a freeze. That led to me just finalize the code to an EXE and try again.

Once the code was in finalized, the wild swings continued. At this point, I am becoming more convinced that I cannot get a good benchmark because of the unpredictable nature of the OS and other programs I have running.

Here is a sample from my finalized:

Building container with 10000 items ...

from start: 40126.902999999998
from end: 40126.906999999999
dif: 0.0040000000008149073

across l_container start: 40126.908000000003
across l_container end: 40126.909
dif: 0.00099999999656574801

across (1 |..| l_upper_bound) start: 40126.913
across (1 |..| l_upper_bound) end: 40126.919000000002
dif: 0.0060000000012223609

across l_interval start: 40126.919000000002
across l_interval end: 40126.921000000002
dif: 0.0020000000004074536

do_all start: 40126.921000000002
do_all end: 40126.921999999999
dif: 0.00099999999656574801

There are times that I can run this and the dif values are so low that they are shown as 0. In the example above, the across using the l_interval is twice as fast as the from loop. I can run this again and again and the numbers will continue wild swings. I do think this is now OS-level latency and not just my little program.


:-)

Larry

lrix

unread,
May 17, 2018, 11:13:04 AM5/17/18
to Eiffel Users
Ah, good to know! Then perhaps this is not just OS related but simply the GC getting involved as well, yes?

Larry Rix

unread,
May 17, 2018, 11:16:03 AM5/17/18
to eiffel...@googlegroups.com, me...@inf.ethz.ch
Totally agree that the point of this tasking is to train at a high level and leave the granular and more nuanced matters to later material. Perhaps an entire section of docs that deals with performance issues and benchmarking both within Eiffel and between Eiffel and other language results for similar algorithms?

Larry Rix
Moonshot Software
Rocket science for everyone!
Savannah, GA

--

To unsubscribe from this group and stop receiving emails from it, send an email to eiffel-users+unsubscribe@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "Eiffel Users" group.

To unsubscribe from this group and stop receiving emails from it, send an email to eiffel-users+unsubscribe@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "Eiffel Users" group.

To unsubscribe from this group and stop receiving emails from it, send an email to eiffel-users+unsubscribe@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "Eiffel Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to eiffel-users+unsubscribe@googlegroups.com.

Finnian Reilly

unread,
May 18, 2018, 12:22:33 PM5/18/18
to Eiffel Users
Hi Larry,
I have revisited my benchmarks again and written this article on eiffel.org
which compare 5 different iteration styles and examines the difference caching the array count as a local variable makes.

We seem to be getting quite different results especially on do_all which ties with across for second place in my benchmarks. Can you elaborate on the "runtime conditions" you did your benchmarks with. Mine are as follows:

Runtime Conditions

EiffelStudio version: 16 (16.05.9.8969 GPL Edition - linux-x86-64)

C Compiler: gcc version 4.8.4

Exe type: finalized exe

Code inlining: level 2

OS: Linux Mint 17.1


regards
Finnian
To unsubscribe from this group and stop receiving emails from it, send an email to eiffel-users...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "Eiffel Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to eiffel-users...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "Eiffel Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to eiffel-users...@googlegroups.com.

Larry Rix

unread,
May 18, 2018, 12:37:16 PM5/18/18
to eiffel...@googlegroups.com

Runtime Conditions

EiffelStudio version: 18.1 (18.01.10.1424 GPL Edition - win64)

C Compiler: Microsoft

Exe type: finalized exe

Code inlining: (inlining size = empty)

OS: Windows 10



Larry Rix
Moonshot Software
Rocket science for everyone!
Savannah, GA

To unsubscribe from this group and stop receiving emails from it, send an email to eiffel-users+unsubscribe@googlegroups.com.

Finnian Reilly

unread,
May 18, 2018, 12:42:10 PM5/18/18
to Eiffel Users
It would be interesting to see what effect setting inline size to 2 would have on your results. I just realized I was calling MEMORY.collect instead of MEMORY.full_collect after each benchmark. Will need to recompile and run again.

Finnian Reilly

unread,
May 18, 2018, 1:21:36 PM5/18/18
to Eiffel Users
Revised Iteration Benchmarks
I have revised my benchmarks article on eiffel.org when I realised I made a mistake and was only doing a partial garbage collect after each benchmark iteration run. Now with a full collect do_all is marginally (but consistently by 1%) beating across. Also the results are much more uniform between try 1 and try 2.
Reply all
Reply to author
Forward
0 new messages