ETS retrieve all objects before deleting the table

Frank Muller

Jan 17, 2022, 4:02:53 PM
to Erlang-Questions Questions
Hey guys,

Is there a way (documented or not) to retrieve all objects from an ordered_set table?

I’d like to retrieve all objects before deleting the table.
What’s the most efficient way to do it?

My table contains roughly 1 million objects.
Objects are tuples: {non_neg_integer(), pos_integer()}

Thanks
/Frank

Mikael Pettersson

Jan 17, 2022, 5:50:29 PM
to Frank Muller, Erlang-Questions Questions
ets:tab2list/1, or possibly ets:match_object/2 or ets:select/2 (I
haven't benchmarked them).
They are all documented.

($subject mentions ETS, so I assume this is about plain ETS and not
Mnesia ram_copies or something like that.)

Frank Muller

Jan 18, 2022, 2:46:37 AM
to Mikael Pettersson, Erlang-Questions Questions
Hi Mikael,

I tried them all before posting here.

From fastest to slowest:
1. select/2 (stable numbers)
2. match_object/2
3. first/1 + next/2
4. tab2list
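
A ranking like the one above could be reproduced with something along these lines. This is only a rough sketch, not the code Frank actually ran: the module name `ets_bench_sketch`, the helper `walk/3`, and the table contents are assumptions made for illustration, and a single `timer:tc/1` call per variant is a crude measurement at best.

```erlang
-module(ets_bench_sketch).
-export([run/1]).

%% Fill an ordered_set table with N {Key, Value} objects and time four
%% ways of copying every object onto the calling process's heap.
%% Returns [{Name, Microseconds}] in the order the cases are listed.
run(N) ->
    Tab = ets:new(bench, [ordered_set, public]),
    true = ets:insert(Tab, [{K, K + 1} || K <- lists:seq(0, N - 1)]),
    Cases = [
        {select, fun() -> ets:select(Tab, [{'_', [], ['$_']}]) end},
        {match_object, fun() -> ets:match_object(Tab, '_') end},
        {first_next, fun() -> walk(Tab, ets:first(Tab), []) end},
        {tab2list, fun() -> ets:tab2list(Tab) end}
    ],
    %% timer:tc/1 returns {ElapsedMicroseconds, Result}; keep only the time.
    Result = [{Name, element(1, timer:tc(F))} || {Name, F} <- Cases],
    true = ets:delete(Tab),
    Result.

%% Key-by-key traversal with ets:first/1 and ets:next/2.
walk(_Tab, '$end_of_table', Acc) ->
    lists:reverse(Acc);
walk(Tab, Key, Acc) ->
    walk(Tab, ets:next(Tab, Key), ets:lookup(Tab, Key) ++ Acc).
```

Calling `ets_bench_sketch:run(1000000)` several times and comparing the numbers gives a more stable picture than a single run.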

Is there a way to only retrieve keys from an ETS table?
/Frank

Fred Youhanaie

Jan 18, 2022, 3:34:48 AM
to erlang-q...@erlang.org
Hi Frank

For the select/2 choice you should be able to use a match spec such as the following (match_object/2 takes a plain pattern and always returns whole objects):

[{ {'$1', '$2'}, [], ['$1'] }]

The head {'$1', '$2'} matches your two-element ETS object, and the body ['$1'] returns the first element only, assuming that's the key.
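
As a small sketch of the keys-only idea: a match spec passed to `ets:select/2` is a list of `{Head, Guards, Body}` tuples, and the body is itself a list. The module and function names below are made up for the example, and it assumes two-element objects whose first element is the key.

```erlang
-module(ets_keys_sketch).
-export([keys/1]).

%% Return only the keys of a table holding {Key, Value} objects.
%% '_' ignores the value; the body ['$1'] returns the bound key.
keys(Tab) ->
    ets:select(Tab, [{{'$1', '_'}, [], ['$1']}]).
```

On an ordered_set table the keys come back in key order.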

Cheers,
Fred

Frank Muller

Jan 18, 2022, 4:08:48 AM
to Fred Youhanaie, erlang-q...@erlang.org
Sorry for being unclear. 

Mikael: yes, it’s plain ETS ordered_set table.
Fred: I’m used to match specs and, as you’ve shown, it’s pretty easy to extract the keys. But it’s still slow: ~90 ms for 1M integer keys on my laptop.

I’m looking for non-documented ways or tricks not found in the doc :-)

Any other tip?

/Frank

Led

Jan 18, 2022, 5:12:19 AM
to Erlang-Questions Questions
> I’d like to retrieve all objects before deleting the table.

Strange task. Table is already "all objects".

> What’s the most efficient way to do it?

ets:give_away/3

--
Led.

Frank Muller

Jan 18, 2022, 5:51:29 AM
to Led, Erlang-Questions Questions
>> I’d like to retrieve all objects before deleting the table.
>
> Strange task. Table is already "all objects".

Nothing strange. Just a simple requirement.
Objects are sent elsewhere for processing before deleting the table content.

>> What’s the most efficient way to do it?
>
> ets:give_away/3

That’s strange indeed. How could give_away/3 help? It only gives the ownership to another proc.

Led

Jan 18, 2022, 6:02:08 AM
to Erlang-Questions Questions
It actually sends all objects elsewhere (to another proc) for processing.

--
Led.

Frank Muller

Jan 18, 2022, 7:28:22 AM
to Led, Erlang-Questions Questions
No, it doesn’t. It only changes the table owner.

Leonard B

Jan 18, 2022, 8:10:38 AM
to Frank Muller, Erlang-Questions Questions
I think this opinion stems from confusion about the GiftData argument
in ets:give_away/3.

AFAIK this just passes whatever 'data' you set for that argument, not
the actual table data, through to the recipient of the 'ETS-TRANSFER'
message.
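
To make the GiftData point concrete, here is a minimal sketch of what the new owner actually receives. The module name and the reply message shape are invented for the example; the `{'ETS-TRANSFER', Tab, FromPid, GiftData}` message is the one documented for `ets:give_away/3`.

```erlang
-module(ets_give_away_sketch).
-export([demo/0]).

%% Give a table away and report what the new owner was sent.
%% Only ownership moves: no objects are copied into the message.
demo() ->
    Parent = self(),
    Receiver = spawn(fun() ->
        receive
            {'ETS-TRANSFER', Tab, FromPid, GiftData} ->
                %% The receiver owns Tab now and can read it like any table.
                Parent ! {received, FromPid, GiftData, ets:info(Tab, size)}
        end
    end),
    Tab = ets:new(demo, [ordered_set, public]),
    true = ets:insert(Tab, [{1, 10}, {2, 20}]),
    true = ets:give_away(Tab, Receiver, hello),
    receive
        {received, From, Gift, Size} -> {From =:= Parent, Gift, Size}
    after 1000 ->
        timeout
    end.
```

The receiver still has to read the objects with select, match_object, or similar if it wants them on its own heap, so the copying cost Frank is asking about does not go away.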

Frank Muller

Jan 18, 2022, 8:30:00 AM
to Leonard B, Erlang-Questions Questions
Indeed Leonard. The option is quite useful but it doesn’t solve my problem. 

Led

Jan 18, 2022, 8:50:12 AM
to Erlang-Questions Questions
>
> I think this opinion stems from confusion about the GiftData argument
> in ets:give_away/3.
>
> AFAIK this just passes whatever 'data' you set for that argument, not
> the actual table data, through to the recipient of the 'ETS-TRANSFER'
> message.

It gives all table data with full access to the process "for
processing". What else do you need?

--
Led.

Hugo Mills

Jan 18, 2022, 10:02:14 AM
to Led, Erlang-Questions Questions
I may be wrong, but my reading of Frank's requirement was that the
table needs to be serialised to disk in some external format so that
it can be processed by some additional tool outside the current erlang
application. No amount of reparenting to another erlang process in the
same VM is going to accomplish that...

Hugo.
Hugo Mills | All hope abandon, Ye who press Enter here.
hugo@... carfax.org.uk |
http://carfax.org.uk/ |
PGP: E2AB1DE4 |

Frank Muller

Jan 18, 2022, 12:58:07 PM
to Led, Erlang-Questions Questions
Table is already public. give_away/3 brings me nothing. Any process can access it (read/write).

Frank Muller

Jan 19, 2022, 5:17:54 PM
to vale...@pixie.co.za, Erlang-Questions Questions
Hi Valentin,

Have you read what I posted in earlier messages?
I already explored first/next with other alternatives. 

Thanks anyway 

/F.


In addition to serialisation to disk (ets:tab2file/2,3), one can always use:

ets:first/1 followed by a number of calls to ets:next/2

e.g.
        Key = ets:first(Tab),
        [TableRow] = ets:lookup(Tab, Key),  % retrieve the data associated with this key
        NextKey = ets:next(Tab, Key),       % and so on

NOTE: once the end of the table is reached, the atom '$end_of_table' is returned by ets:next/2 instead of a key.
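
Written out as a loop, the first/next traversal might look like the sketch below. The module and function names are hypothetical, and it assumes no other process is deleting objects while the walk is in progress.

```erlang
-module(ets_walk_sketch).
-export([walk/1]).

%% Collect every object by stepping through the keys one at a time.
walk(Tab) ->
    walk(Tab, ets:first(Tab), []).

%% ets:first/1 returns '$end_of_table' for an empty table, and
%% ets:next/2 returns it after the last key, so one clause covers both.
walk(_Tab, '$end_of_table', Acc) ->
    lists:reverse(Acc);
walk(Tab, Key, Acc) ->
    Objs = ets:lookup(Tab, Key),
    %% lists:reverse/2 pushes Objs onto Acc in reverse, so the final
    %% lists:reverse/1 restores the traversal order.
    walk(Tab, ets:next(Tab, Key), lists:reverse(Objs, Acc)).
```

Each step costs one extra lookup per key, which is consistent with this approach ranking below select/2 and match_object/2 in Frank's list.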


On 19 Jan 2022, at 23:33, vale...@micic.co.za <vale...@pixie.co.za> wrote:

To serialize ETS to a file on the disk, one may use ets:tab2file/2,3.

It does not destroy the table, but simply dumps the content of the table to a file. I’ve used it a number of times and it works as advertised.
I don’t think there’s a tool available to traverse this file without loading it back to ETS (via ets:file2tab/1,2).

However, it may be a start.
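
A round trip through ets:tab2file/2 and ets:file2tab/1 could be sketched like this. The module name, function name, and sample objects are assumptions for illustration; the caller supplies the file name.

```erlang
-module(ets_file_sketch).
-export([roundtrip/1]).

%% Dump a table to File, delete the table, then load it back.
%% File is a file name string chosen by the caller.
roundtrip(File) ->
    Tab = ets:new(demo, [ordered_set, public]),
    true = ets:insert(Tab, [{1, 10}, {2, 20}]),
    ok = ets:tab2file(Tab, File),       % the table itself is left intact
    true = ets:delete(Tab),
    {ok, Tab2} = ets:file2tab(File),    % recreates a table from the dump
    Objs = ets:tab2list(Tab2),
    true = ets:delete(Tab2),
    ok = file:delete(File),
    Objs.
```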


V/

Mikael Karlsson

Jan 20, 2022, 3:09:15 AM
to Frank Muller, Erlang-Questions Questions
ets:foldl/3 maybe?
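
For reference, a fold over the table might look like the following sketch (module and function names are made up; objects are assumed to be {Key, Value} tuples as in Frank's table). Nobody in the thread benchmarked this variant, so no claim is made here about how it compares to select/2.

```erlang
-module(ets_fold_sketch).
-export([collect/1, sum_values/1]).

%% Collect all objects; on an ordered_set the fold visits them
%% from first to last, so reversing the accumulator restores key order.
collect(Tab) ->
    lists:reverse(ets:foldl(fun(Obj, Acc) -> [Obj | Acc] end, [], Tab)).

%% A fold can also process objects without building a list at all.
sum_values(Tab) ->
    ets:foldl(fun({_Key, Value}, Sum) -> Sum + Value end, 0, Tab).
```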

Jesper Louis Andersen

Jan 28, 2022, 3:35:41 PM
to Frank Muller, Erlang-Questions Questions
On Tue, Jan 18, 2022 at 8:46 AM Frank Muller <frank.mu...@gmail.com> wrote:
> Hi Mikael,
>
> I tried them all before posting here.
>
> From fastest to slowest:
> 1. select/2 (stable numbers)
> 2. match_object/2
> 3. first/1 + next/2
> 4. tab2list


You also need to consider memory usage. When you copy a potentially large ETS table into the heap of a process, you risk blowing up if the table is large. I'd probably call ets:safe_fixtable/2 on the table, traverse it with ets:match_object/3, 1000 elements at a time or so, then remove the fix when '$end_of_table' is reached. If any kind of disk I/O is involved, you are likely to be I/O bound on the disk anyway, or your serialization scheme is what is going to slow you down. But because of the table fix, you should be able to work on the disk set in a stable manner.

Clearly, there's some consideration w.r.t. whether the serialization has to "stop the world" until the table is on disk. Working around that would be my main concern.
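
The chunked traversal described above can be sketched roughly as follows. The module name `ets_chunk_sketch`, the function `drain/3`, and the callback argument are assumptions for illustration; the chunk size of 1000 from the text is passed in by the caller.

```erlang
-module(ets_chunk_sketch).
-export([drain/3]).

%% Walk Tab in chunks of about ChunkSize objects and call Fun on each
%% chunk, so only one chunk at a time lives on the caller's heap.
drain(Tab, ChunkSize, Fun) ->
    true = ets:safe_fixtable(Tab, true),
    try
        loop(ets:match_object(Tab, '_', ChunkSize), Fun)
    after
        %% Always release the fix, even if Fun crashes.
        ets:safe_fixtable(Tab, false)
    end.

loop('$end_of_table', _Fun) ->
    ok;
loop({Objs, Continuation}, Fun) ->
    Fun(Objs),
    %% ets:match_object/1 resumes from the continuation.
    loop(ets:match_object(Continuation), Fun).
```

For example, `ets_chunk_sketch:drain(Tab, 1000, fun(Objs) -> Worker ! {chunk, Objs} end)` would hand each chunk to a hypothetical `Worker` process; deleting the table afterwards is left to the caller.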

Leonard B

Jan 28, 2022, 4:33:31 PM
to Jesper Louis Andersen, Erlang-Questions Questions
IIRC tab2file does basically this: fixtable -> select with a limit and a
continuation, writing each chunk to the target file.

I'm pretty sure that if there were a quicker (and still "safe") way to
do this, the code would be using that method.