[erlang-questions] ets vs list

40 views
Skip to first unread message

Max Lapshin

unread,
Sep 13, 2010, 8:58:25 AM9/13/10
to Erlang-Questions Questions
erlyvideo has video streams inside, each video stream is an object,
that tracks list of subscribed clients.
Each client can be in several states.

On each incoming frame clients are checked in this list and frame is
delivered to each. Also, clients may change their state and it happens
more rare, than selecting them from list.

Proportions are so: from 1000 clients, about 850 are usually in state
"active", 120 in "starting", 30 are "paused"
"active" clients receive each frame.

I was using ets to store their pids, but today I've created benchmark:
http://gist.github.com/577086

and it seems, that my choice was wrong:

15> c(bm_ets).
{ok,bm_ets}
16> bm_ets:run().
Initialized, starting
[RO] ETS: 3987268, List: 422858
[RW] ETS: 4043659, List: 1213914
ok


I can interpret these results so:
list is 10 times faster than ets on often full table scans and is 4
times faster, than 50/50:scan and update


My questions are:
1) are my conclusions right?
2) what ets is for, if list is fast enough on inserts? ets happened to
be unusable in creating timeshift (storing large binaries) because of
copy, it is much slower on extracting large amount of small objects.
Where should I use it?

________________________________________________________________
erlang-questions (at) erlang.org mailing list.
See http://www.erlang.org/faq.html
To unsubscribe; mailto:erlang-questio...@erlang.org

Ulf Wiger

unread,
Sep 13, 2010, 9:11:27 AM9/13/10
to erlang-q...@erlang.org
On 09/13/2010 02:58 PM, Max Lapshin wrote:
>
> My questions are:
> 1) are my conclusions right?
> 2) what ets is for, if list is fast enough on inserts? ets happened to
> be unusable in creating timeshift (storing large binaries) because of
> copy, it is much slower on extracting large amount of small objects.
> Where should I use it?

In your benchmark, you are doing full scans of the data set.
Lists are hard to beat then.

Ets tables are primarily good when you need random access, in which
case lists are pretty bad, with O(N) complexity. On the other hand,
for small sets, even lists can be quite competitive for random
access.

BR,
Ulf W

Jesper Louis Andersen

unread,
Sep 13, 2010, 9:45:47 AM9/13/10
to Max Lapshin, Erlang-Questions Questions
On Mon, Sep 13, 2010 at 2:58 PM, Max Lapshin <max.l...@gmail.com> wrote:

> and it seems, that my choice was wrong:
>

In addition to the comments of Ulf: When you run the ets:select/2
call, you are copying all of the ETS table into the memory space of
the process and then operate on it. You don't need that for the List
case since the memory is already in the heap of the process (Line 47).
Line 48 can be merged with the match-spec in Line 46 so you avoid to
create yet another list. This will also mean you (at least in theory)
copy less data.

In Haskell, you can rely on the optimizer to fuse several traversals
of a list into a single run over the list. This is not the case for
strict/eager languages, so you need to optimize list access to only
run a single pass.

--
J.

Rapsey

unread,
Sep 13, 2010, 9:49:39 AM9/13/10
to erlang-q...@erlang.org
If you need random access and fast traversal, dict and gb_trees are all good
and quite fast options.


Sergej

Ulf Wiger

unread,
Sep 13, 2010, 10:29:21 AM9/13/10
to erlang-q...@erlang.org
On 09/13/2010 02:58 PM, Max Lapshin wrote:
>
> 2) what ets is for, if list is fast enough on inserts?

The main unique feature of ets is that data residing in
ets is not garbage-collected.

Roughly speaking, the cost of GC is proportional to the
amount of live data, so if you have very large data sets,
process heap-based data will have an increasing hidden
cost in GC sweep/copy time, even though access times may
look excellent in benchmarks.

Running benchmarks for longer periods should allow you
to include amortized GC cost in the results, but you
will also get a bunch of other noise. Tracing on GC
events with timestamps may be a more accurate method.

BR,
Ulf

Morten Krogh

unread,
Sep 13, 2010, 10:41:42 AM9/13/10
to erlang-q...@erlang.org
What exactly does it mean that data residing in ets isn't garbage
collected?

Does it mean that if a {key, Value} pair is in the ets table T,
and I write

ets:delete(T, Key)

then Value will not be garbage collected.

Morten.

Max Lapshin

unread,
Sep 13, 2010, 10:44:15 AM9/13/10
to Ulf Wiger, erlang-q...@erlang.org
But data from ets will be copied on each incoming frame, so it will
also be GC-ed

Ulf Wiger

unread,
Sep 13, 2010, 10:58:04 AM9/13/10
to erlang-q...@erlang.org
On 09/13/2010 04:41 PM, Morten Krogh wrote:
> What exactly does it mean that data residing in ets isn't
> garbage collected?
>
> Does it mean that if a {key, Value} pair is in the ets table T,
> and I write
>
> ets:delete(T, Key)
>
> then Value will not be garbage collected.
>
> Morten.

If you delete an object from an ets table, it will be freed
on the spot - think malloc/free.

If you don't explicitly delete it, it will stay there, and
will never be touched by a garbage collector. The garbage
collector cares about process heaps and binaries, but not
about ets. Once an object is copied into ets, it is outside
the jurisdiction of the GC; when read from ets, it is copied
onto the process heap, and this copy becomes subject to GC.

Erlang's per-process garbage collector periodically copies
live data to a new instance of the process heap. There are
a few variations of the theme, but basically, that's how
it works. When some data is no longer referenced, it becomes
garbage on the heap, and eventually, the heap will fill up
with garbage and live data, triggering the garbage collector.

In this sense, GC is really a misnomer, since it collects,
and copies, live data, rather than messing about with the
garbage. This is also the reason why GC cost can be said
to be proportional to the amount of live data, rather than
the amount of garbage. Producing lots of garbage, though,
will trigger GC more frequently, causing more scans and
copying of the live data.

BR,
Ulf W

Robert Virding

unread,
Sep 13, 2010, 10:58:44 AM9/13/10
to Morten Krogh, erlang-q...@erlang.org
All data in an ets table will be garbage collected! It is perfectly
safe to insert and delete data as often as you need.

What Ulf meant was that an ets table is not garbage collected in the
same way as a normal process and that you don't run into same problems
with overly long garbage collections times as you can with a processes
that have a *LOT* of local data. This is why it is safe to have large
ets tables and why you should be careful in having *LARGE* databases
as local data to a process. This is a problem using dict, gb_trees and
array, and lists.

The downside of this is that data has to be copied from an ets table
into the process heap before it can be used, which is why we have
ets:match and ets:select which allow you to perform more tests on
potential data before copying them into the process heap. This is a
benefit using dict, gb_trees and array, and lists.

So they are all *safe* but perform differently.

Robert

Morten Krogh

unread,
Sep 13, 2010, 11:16:17 AM9/13/10
to erlang-q...@erlang.org
Ulf and Robert,
thanks for your answers. It makes sense that delete calls free in the
malloc sense.

What I don't get now, is what data the data in ets that is not garbage
collected, actually is.
Does the ets process have data beyond the table itself?
Are we talking about intermediate values used in the match function etc?

Morten.

Ulf Wiger

unread,
Sep 13, 2010, 11:43:38 AM9/13/10
to erlang-q...@erlang.org
On 09/13/2010 05:16 PM, Morten Krogh wrote:
> Ulf and Robert,
> thanks for your answers. It makes sense that delete calls free in the
> malloc sense.
>
> What I don't get now, is what data the data in ets that is not garbage
> collected, actually is.

Hmm, not sure I understand the question, but data copied into ETS is
managed separately from the process heap data. The Erlang VM has
a system of "memory carriers" to manage the allocation of different
types of data, in order to fight memory fragmentation. ETS data, which
tends to be more long lived and of different granularity than process
heaps, is stored in separate carrier blocks. Binaries are stored in
other types of carriers, etc. But they are just regular Erlang objects.

Basically, you have a few different kinds of data inside the VM.
To name a few:
- The atom table, is never garbage-collected; atoms are never deleted
- ETS tables, whole tables are automatically removed when the owner
process dies; individual objects remain while the table remains,
unless explicitly deleted
- Large binaries, reside on a shared heap; they are reference-counted
- Process heaps, are managed with a stop-and-copy garbage collector;
when a process dies, the whole process heap is freed, informing
monitoring and linked processes, as well as decrementing binary
reference counters and deleting any ETS tables owned by the process.

> Does the ets process have data beyond the table itself?
> Are we talking about intermediate values used in the match function etc?


Ah, the select function actually executes in a special 'pseudo process'
inside the Erlang VM. It has its own little stack machine, called the
PAM ("PAM" stands for "Patrik's Abstract Machine", I think, where
"Patrik" stands for Patrik Nyblom).

erts/emulator/beam/erl_db_util.c

So garbage resulting from a select operation accumulates on the
pseudo process heap and is subject to the same kind of GC as
any regular process (I assume).

There's a _lot_ to memory management in the Erlang VM. I don't
claim to understand more than a tiny part of it.

BR,
Ulf W

Morten Krogh

unread,
Sep 13, 2010, 12:10:21 PM9/13/10
to erlang-q...@erlang.org
Thanks again.

My question might have been hard to understand because I made an
assumption that might be wrong, namely that
an ets table has a corresponding process. I don't mean the owning
process, but an implicit process that is spawned automatically
when the table is created, and that all insertions and deletions are
going through that process. Is that just my imagination?
Is an ets table only a hash table and nothing more? No corresponding
process that serializes all access to it? ETS:lookup is just a function
call, not a hidden
message pass, receive construct or what?

Morten.

Ulf Wiger

unread,
Sep 13, 2010, 12:25:49 PM9/13/10
to erlang-q...@erlang.org
On 09/13/2010 06:10 PM, Morten Krogh wrote:
> Thanks again.
>
> My question might have been hard to understand because I made an
> assumption that might be wrong, namely that
> an ets table has a corresponding process. I don't mean the owning
> process, but an implicit process that is spawned automatically
> when the table is created, and that all insertions and deletions are
> going through that process. Is that just my imagination?
> Is an ets table only a hash table and nothing more? No corresponding
> process that serializes all access to it? ETS:lookup is just a function
> call, not a hidden
> message pass, receive construct or what?
>
> Morten.

ETS tables do not have a corresponding process (except for the
temporary pseudo process during select()). Serialization is done
with mutexes, so essentially, it's just a hash table or B-tree.

That said, Erlang purists often hide behind the fact that you
_could_ model ETS tables with processes and message passing
(thus, process spawning and messaging are still the only
means of side-effecting in Erlang ;-)

It's not terribly difficult, but hard to do efficiently. The
main issue is named tables, since you need a table registry,
which implies another set of messages. I did this once (and
as it happened, Robert V was doing the same thing at the same
time - I was doing it for the Erlang Processor, and he was
doing it for his own VM, VEE). As far as I recall, performance
of my ETS emulation was ca 30x slower than regular ETS, at
least on named tables*. :) This was many years ago, so the
numbers may be completely different today.

BR,
Ulf W

* But, as it happened, the Erlang Processor used ca 30x fewer
clock cycles than my workstation to execute Erlang code, so
ultimately I thought myself no worse off. :)

Morten Krogh

unread,
Sep 13, 2010, 1:11:46 PM9/13/10
to erlang-q...@erlang.org

Thanks. I just tried in the shell and saw that there were no new
processes after creating an ETS table. fine.

Surely, you could model ets with a process and message passing. Why
should it be slower? The only overhead should be the message passing.

So, I guess your benchmark might depend on the size of the key/value
pairs. For large values, the performance should be the same, for small
values the process might be a bit slower.

So, if we compare ets with dict or the process dictionary, then ets
should perform really badly on large values, it seems.

Suppose we have a giant list, that we store in the tables with some key,
i.e, store(key) = [..,..,,,...............................]

If we want to traverse that list, dict and pd would start immediately,
whereas the list from ets would have to be copied before traversing
could start.

And after reading the list from ets it would have to be freed at the
next garbage collection, where as the dict and pd didn't generate any
garbage.

Morten.

Ulf Wiger

unread,
Sep 13, 2010, 2:12:32 PM9/13/10
to erlang-q...@erlang.org
On 09/13/2010 07:11 PM, Morten Krogh wrote:
>
>
> Thanks. I just tried in the shell and saw that there were no new
> processes after creating an ETS table. fine.
>
> Surely, you could model ets with a process and message passing. Why
> should it be slower? The only overhead should be the message passing.

No, there are several other overheads. For instance, you must somehow
implement a hash table from immutable dynamically typed data structures,
whereas in ETS (in C), you can use mutable arrays. The dict module,
for example, uses a tree structure of tuples. Once you've located
the leaf tuple where the object is to reside, you have to create
a new copy of that tuple, which means you must also update (copy)
the parent tuple, etc.

Also, when traversing the tree, you cannot avoid the type checks
ensuring that it's actually a tuple you're stepping into. In the
case of C, you will not be performing runtime type checks on the
hash space.

OTOH, with SMP Erlang, you have to protect the ETS tables with
mutexes, whereas on the process heap, you don't have to worry
about contention.

> So, I guess your benchmark might depend on the size of the key/value
> pairs. For large values, the performance should be the same, for small
> values the process might be a bit slower.

In the case I described, the task was to model the ETS API
faithfully, which meant a separate process per table. This also
included copying, as data sent in messages is copied from
process to process. Otherwise, if you use dict, gb_trees, etc.
or the process dictionary, you can avoid the copying and get
quite good performance if the stored data structures are large.

> So, if we compare ets with dict or the process dictionary, then ets
> should perform really badly on large values, it seems.

Yes. OTOH, the process dictionary is difficult to share...


BR,
Ulf W

Morten Krogh

unread,
Sep 13, 2010, 3:05:19 PM9/13/10
to erlang-q...@erlang.org

> No, there are several other overheads. For instance, you must somehow
> implement a hash table from immutable dynamically typed data structures,
> whereas in ETS (in C), you can use mutable arrays. The dict module,
> for example, uses a tree structure of tuples. Once you've located
> the leaf tuple where the object is to reside, you have to create
> a new copy of that tuple, which means you must also update (copy)
> the parent tuple, etc.

But it is to some extent self imposed that you want a functional data
structure.
You could have used the process dictionary for this or some foreign
interface.
Nowadays nif, but that was not available to you then.

Why didn't you use the process dictionary? It is made for this situation.

Morten.

Ulf Wiger

unread,
Sep 13, 2010, 6:17:23 PM9/13/10
to erlang-q...@erlang.org
On 09/13/2010 09:05 PM, Morten Krogh wrote:
>
>> No, there are several other overheads. For instance, you must somehow
>> implement a hash table from immutable dynamically typed data structures,
>> whereas in ETS (in C), you can use mutable arrays. The dict module,
>> for example, uses a tree structure of tuples. Once you've located
>> the leaf tuple where the object is to reside, you have to create
>> a new copy of that tuple, which means you must also update (copy)
>> the parent tuple, etc.
>
> But it is to some extent self imposed that you want a functional data
> structure.

Sure. I'm not trying to criticize Erlang, but merely to point
out what makes ETS tables implemented in the VM faster than
an Emulation of them in Erlang code.


> You could have used the process dictionary for this or some foreign
> interface.
> Nowadays nif, but that was not available to you then.

But in both cases (for Robert and for me), the whole _point_ was
to implement them in Erlang, since we wanted to get the functionality
in place without having to spend a lot of time with a native
implementation. The goal was not speed at all, but productivity.

Robert Virding

unread,
Sep 13, 2010, 8:40:38 PM9/13/10
to Morten Krogh, erlang-q...@erlang.org
On 13 September 2010 21:05, Morten Krogh <m...@amberbio.com> wrote:
>
>> No, there are several other overheads. For instance, you must somehow
>> implement a hash table from immutable dynamically typed data structures,
>> whereas in ETS (in C), you can use mutable arrays. The dict module,
>> for example, uses a tree structure of tuples. Once you've located
>> the leaf tuple where the object is to reside, you have to create
>> a new copy of that tuple, which means you must also update (copy)
>> the parent tuple, etc.
>
> But it is to some extent self imposed that you want a functional data
> structure.
> You could have used the process dictionary for this or some foreign
> interface.
> Nowadays nif, but that was not available to you then.
>
> Why didn't you use the process dictionary? It is made for this situation.

The problem comes back to memory management and garbage collection. A
process heap is not collected incrementally but in one go. During that
time the cpu/core/thread running that process will stop. This is no
problem as most process heaps are not large and there is no noticeable
pause. Also the collector is optimised to keep these times down. It
has to be done in this way as there no other way to detect when terms
are free except by scanning the heap.*

In an ets table heap however, as you do explicit insert and delete,
collection takes part every time you do a delete so there are no long
pauses at all. This is irrespective of how big the ets table is, which
means is to possible to have LARGE tables without any pause times for
collection. But this also means that the data in an ets table cannot
exist in a normal process heap but must be separate from it. This is
why you get the copying of data between processes and ets tables. This
is why you have ets:match and ets:select which allow you to do more
checking of data in the ets pseudo-process before copying it over to a
process heap, which is a big win when scanning a table.

Now using the process dictionary would allow you to use the same
algorithms for managing the table itself as with ets tables but it
doesn't solve the big problem of where to store the actual data.
Either you keep it in the process heap (which is done today) but then
incur the all problems with having LARGE process heaps, or you keep it
in a separate heap but then incur all the problems with copying. In
either case there is not a clear benefit in using the process
dictionary.

You get the same problem if you were to try to use NIFs. Where do you
put the data? The process heap should only contain valid erlang terms
and does not support mutable data, which limits how much smarter you
can actually make it by not using plain erlang. Though you could
probably improve dict by adding some NIFs, definitely something to
consider for the future.

There is another point as well and that is that using the process
dictionary gives the code side effects and makes it non-functional.
Ah, you might say, but does not using ets do the same thing? Yes, but
ets tables behave as processes (and can be emulated using processes)
which confines all side effects to inter-process communication. Which
is a Good Thing. If this worries you is another thing but it is nice
to be able to follow the flow of state through your code as much as
possible.

As usual it all boils down to which trade-offs you are willing to
make. Ets tables and dict/gb_trees give you two different choices with
different properties.

Sorry this became a bit longer than I had originally intended.

Robert

* Yes, there are garbage collection algorithms that incrementally
collect data which would allow you to have LARGE process heaps without
incurring long pause for collection. These are, however, more complex
and costly and it is not clear how much you actually gain by using
them. Multi-core environments don't help either in this respect.

Ulf Wiger

unread,
Sep 14, 2010, 3:40:45 AM9/14/10
to erlang-q...@erlang.org
On 09/14/2010 02:40 AM, Robert Virding wrote:
>
>
> * Yes, there are garbage collection algorithms that incrementally
> collect data which would allow you to have LARGE process heaps without
> incurring long pause for collection. These are, however, more complex
> and costly and it is not clear how much you actually gain by using
> them. Multi-core environments don't help either in this respect.

I think this is one thing that is seldom overlooked. Erlang is a bit
unusual in its use of per-process garbage collection, but this is a
model that fits multicore very well. The enablers are the fine-grained
concurrency together with share-nothing semantics, and the result
is that Erlang can pretty robustly achieve soft real-time
characteristics even when the VM is carrying gigabytes of data,
and still have a relatively simple garbage collector.

BR,
Ulf W

Johan Montelius

unread,
Sep 14, 2010, 5:02:41 AM9/14/10
to erlang-q...@erlang.org
While we're talking ets tables:

Wouldn't it be nice to be able to send table references across nodes?
Today when you create a new tabel you get a table reference that is an
integer. The table can only be accessed by processes in the same node and
god know what will happen if you send the reference over to another node
where there is another table registered under the same number.

Since tables have a mutable state they should be modeled as processess.
When you create a table you get a process identifier. Locally on the node
where the table was created all operations are done as usual i.e. direct
access to the table without the overhead of sending messages. When you
send the process identifier over to another node however, a process
identifier is sent that refers to a proxy process on the originating node.

Using the functions in the ets module using a regular process identifier
would result in messages being sent to that process and replies being
received. Just like accessing a local ets table (slower and we have to
have some error codes if the process is dead).

One could the also open up for a asynchronous access to the table:

Ref = ets:lookup_and_send_me_the_reply(Table, Key),
do_something(),
receive
{Ref, Value} ->

How's that :-)

Johan

--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/

Kenneth Lundin

unread,
Sep 14, 2010, 1:22:01 PM9/14/10
to Johan Montelius, erlang-q...@erlang.org
On Tue, Sep 14, 2010 at 11:02 AM, Johan Montelius <joha...@kth.se> wrote:
> While we're talking ets tables:
>
> Wouldn't it be nice to be able to send table references across nodes? Today
> when you create a new tabel you get a table reference that is an integer.
> The table can only be accessed by processes in the same node and god know
> what will happen if you send the reference over to another node where there
> is another table registered under the same number.

From the very beginning ets-tables almost in the way you suggest i.e
the table identifier was location transparent
and could be used in any process on any node.

OTP R7B released 2000 was the last version that worked like that.
Unfortunately I don't remember exactly why we removed the distribution
support from ets except that it caused more problems than it solved. I
also remember that it was no problem to remove this feature, i.e it
was not much used.

I will check why we removed it and come back.

/Kenneth , Erlang/OTP Ericsson

Max Lapshin

unread,
Sep 18, 2010, 1:25:50 PM9/18/10
to Kenneth Lundin, Johan Montelius, erlang-q...@erlang.org
Lionet (Lev Valkin) helped a lot with excelent solution:

store pids in duplicate_bag with key by state. It happened to be very
complicated solution, but fastest.

Reply all
Reply to author
Forward
0 new messages