[erlang-questions] Process Dictionary limitations??


Charles Hixson

Oct 10, 2012, 6:55:46 PM
to Erlang-Questions Questions
I'm choosing a language to implement a ... well, neural network is
wrong, and so is cellular automaton, but it gives the idea. Anyway, I'm
going to need, in each cell, a few stateful items, e.g. activation level.

When I look at what Erlang can do, I see that the Process Dictionary
looks as if it would serve my needs, but then I am immediately warned
not to use it, that it will cause bugs. These stateful terms will not
be exported from the cell within which they are resident. Is this still
likely to cause problems? Is there some better approach to maintaining
state? (I can't just generate a new process, because other cells will
need to know how to access this one, or to test that it has been rolled
out.)

--
Charles Hixson


Michael Truog

Oct 10, 2012, 7:21:29 PM
to Charles Hixson, Erlang-Questions Questions
On 10/10/2012 03:55 PM, Charles Hixson wrote:
> I'm choosing a language to implement a ... well, neural network is wrong, and so is cellular automaton, but it gives the idea.  Anyway, I'm going to need, in each cell, a few stateful items, e.g. activation level.
>
> When I look at what Erlang can do, I see that the Process Dictionary looks as if it would serve my needs, but then I am immediately warned not to use it, that it will cause bugs.  These stateful terms will not be exported from the cell within which they are resident.  Is this still likely to cause problems?  Is there some better approach to maintaining state?  (I can't just generate a new process, because other cells will need to know how to access this one, or to test that it has been rolled out.)

This explains some basics about the process dictionary: http://www.erlang.org/course/advanced.html#dict
Quoted below:
  • Destroys referential transparency
  • Makes debugging difficult
  • Survives Catch/Throw
So it is much better to keep state in ordinary variables passed as function arguments, which makes the side effects explicit.  This is exactly the role of the State variable of a gen_server behaviour (http://www.erlang.org/doc/man/gen_server.html).  Depending on the expected state handling, you might want a gen_server, a gen_event, or a gen_fsm for each cell.  If you want to avoid OTP behaviours, you could write plain Erlang code instead, but that code is likely to be more error-prone (especially since you are asking this question).
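
A minimal sketch of what "state in function variables" can look like in plain Erlang, without any OTP behaviour. The module name `cell`, the message shapes, and the one-second timeout are assumptions made up for illustration; they do not come from any library.

```erlang
-module(cell).
-export([start/1, activation/1, stimulate/2]).

%% Spawn a cell whose only state is its activation level.
start(Activation) ->
    spawn(fun() -> loop(Activation) end).

%% Synchronously ask the cell for its current activation level.
activation(Pid) ->
    Ref = make_ref(),
    Pid ! {get_activation, self(), Ref},
    receive
        {Ref, Value} -> Value
    after 1000 ->
        timeout
    end.

%% Asynchronously add Delta to the cell's activation level.
stimulate(Pid, Delta) ->
    Pid ! {stimulate, Delta},
    ok.

%% The state lives in the argument of the tail-recursive loop;
%% "changing" it means recursing with a new value.
loop(Activation) ->
    receive
        {stimulate, Delta} ->
            loop(Activation + Delta);
        {get_activation, From, Ref} ->
            From ! {Ref, Activation},
            loop(Activation)
    end.
```

From the shell, `P = cell:start(0.0), cell:stimulate(P, 0.5), cell:activation(P).` should return the updated level. Nothing outside the process can touch `Activation` except by sending one of these messages.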

Charles Hixson

Oct 10, 2012, 9:40:29 PM
to Michael Truog, Erlang-Questions Questions
Thank you.  That confirms the recommendation against using the process dictionary, though I will admit that I can't see any way that your proposed alternatives could replace it.

-- 
Charles Hixson

Michael Truog

Oct 10, 2012, 10:07:58 PM
to Charles Hixson, Erlang-Questions Questions
The State variable in the OTP behaviours is usually a record, which is preprocessor syntactic sugar for a tuple (conceptually similar to a struct in C).  A record gives constant-time (O(1)) read access to tuple elements by field name.  Setting an element has more overhead: terms are immutable in Erlang, so an update builds a new tuple rather than modifying the old one.  If you need dynamic field names, there are many key-value data structures to choose from.  In OTP there are dict, gb_trees, and orddict (and array if your key type is always an integer).  If you need to use dynamic strings as keys, I have a trie here: https://github.com/okeuday/trie.  However, it is best (most typical and maintainable) if you can rely on a record (tuple) for process state.  Within the state record, you can always define any dynamic lookups you might need or other data structures you wish to utilize.
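
A sketch of that layout, assuming a hypothetical `cell_server` module: a record holds the fixed fields, and one field holds a `dict` for the dynamic keys (here, a weight per target cell). The function names `set_weight/3` and `weight/2` are invented for the example.

```erlang
-module(cell_server).
-behaviour(gen_server).
-export([start_link/0, set_weight/3, weight/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

%% Fixed fields are accessed by name; dynamic keys go in the dict.
-record(state, {activation = 0.0,
                weights = dict:new()}).   % target -> weight

start_link() ->
    gen_server:start_link(?MODULE, [], []).

set_weight(Cell, Target, W) ->
    gen_server:cast(Cell, {set_weight, Target, W}).

weight(Cell, Target) ->
    gen_server:call(Cell, {weight, Target}).

init([]) ->
    {ok, #state{}}.

%% Reading: look the target up in the dict held by the record.
handle_call({weight, Target}, _From, State = #state{weights = Ws}) ->
    Reply = case dict:find(Target, Ws) of
                {ok, W} -> W;
                error -> undefined
            end,
    {reply, Reply, State}.

%% Writing: return a new record that carries the updated dict.
handle_cast({set_weight, Target, W}, State = #state{weights = Ws}) ->
    {noreply, State#state{weights = dict:store(Target, W, Ws)}}.

handle_info(_Msg, State) -> {noreply, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.
```

The state is only reachable through the callbacks, so every change to a weight is visible in the return value of a `handle_*` clause.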

Hope this helps make it clear that the key-value data structures available in separate modules (either within OTP or external to OTP) help to make sure you can create key-value lookups within Erlang code without utilizing the process dictionary.

The other direction you can go is to use ets, which provides global storage in Erlang, with each table owned by a process.  However, it is best to avoid global variables whenever possible.
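
For comparison, a small sketch of the ets route. The table name `cell_weights` and the `{From, To}` key shape are assumptions chosen for the example. The table belongs to the process that creates it and disappears when that process exits.

```erlang
-module(ets_demo).
-export([run/0]).

run() ->
    %% A public table can be read and written by any process
    %% that knows the table identifier.
    Tab = ets:new(cell_weights, [set, public]),
    true = ets:insert(Tab, {{a, b}, 0.5}),
    %% Look up the weight stored for the link a -> b.
    [{{a, b}, W}] = ets:lookup(Tab, {a, b}),
    true = ets:delete(Tab),
    W.
```

Because any process can write to a public table, this has the same "who changed it?" problem as any other global variable.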



Richard O'Keefe

Oct 11, 2012, 12:31:44 AM
to Michael Truog, Erlang-Questions Questions
You can think of the process dictionary this way:

If there is a function clause

f(X, Y, Z) -> A = g(X, Y), B = h(Z, A), q(A, B)

replace it by

f(X, Y, Z, D0) ->
    {A, D1} = g(X, Y, D0),
    {B, D2} = h(Z, A, D1),
    q(A, B, D2)

and replace

get(K)

by

{get(K, D), D}

and

put(K, V)

by

{V, put(K, V, D)}

I've omitted exception handling, but it's not actually
all that different. Tedious rather than difficult.

So there's an important sense in which it doesn't spoil the
functional purity of the language. (I got this idea from a
Xerox blue-and-white CS technical report giving a functional
semantics for Euclid.) There are bad things that can happen
in imperative languages that still can't happen in Erlang.

One big bad thing *can* happen. If you call a function you
haven't read, you do not know what it is going to do to the
process dictionary. It could change the value associated
with any key; it could delete any key; it could add any
key->value mapping. Even specifications don't help,
because although there have been several type systems that
include effects, the type system Erlang now uses is not one
of them.
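
A concrete instance of the transformation sketched above, as an illustration only. The module `counter_demo`, the key `count`, and the choice of `orddict` for the explicit dictionary are assumptions, not part of the original sketch.

```erlang
-module(counter_demo).
-export([bump_pd/0, bump_explicit/1]).

%% Implicit version: reads and writes the process dictionary.
bump_pd() ->
    N = case get(count) of
            undefined -> 0;
            Old -> Old
        end,
    put(count, N + 1),
    N + 1.

%% Explicit version: the dictionary D0 comes in as an argument and
%% the updated dictionary is returned alongside the result.
bump_explicit(D0) ->
    N = case orddict:find(count, D0) of
            {ok, Old} -> Old;
            error -> 0
        end,
    D1 = orddict:store(count, N + 1, D0),
    {N + 1, D1}.
```

Both versions count the same way, but only the explicit one shows in its signature that it touches the dictionary. With `bump_pd/0`, a caller has to read the body to find out that the key `count` is being changed.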

Attila Rajmund Nohl

Oct 11, 2012, 5:20:52 AM
to Charles Hixson, Erlang-Questions Questions
2012/10/11 Charles Hixson <charle...@earthlink.net>:
> I'm choosing a language to implement a ... well, neural network is wrong,
> and so is cellular automaton, but it gives the idea. Anyway, I'm going to
> need, in each cell, a few stateful items, e.g. activation level.
>
> When I look at what Erlang can do, I see that the Process Dictionary looks
> as if it would serve my needs, but then I am immediately warned not to use
> it, that it will cause bugs. These stateful terms will not be exported from
> the cell within which they are resident. Is this still likely to cause
> problems? Is there some better approach to maintaining state? (I can't
> just generate a new process, because other cells will need to know how to
> access this one, or to test that it has been rolled out.)

I don't really understand why you can't generate a new process for
each cell - just send a message to the neighbouring cells that there's
a new cell. I think each cell needs to know its neighbours anyway.

Eric Newhuis

Oct 11, 2012, 9:14:18 AM
to Richard O'Keefe, Erlang-Questions Questions
Heheh when may we expect the Emacs refactoring macro for this? LOL

Dmitry Belyaev

Oct 11, 2012, 9:48:34 AM
to Erlang-Questions Questions
Isn't this feature, I mean process dictionaries, much worse for the functional nature of Erlang than parameterized modules?

Why is nobody talking about removing pdicts, while there is such talk about functional pmods?

--
Dmitry Belyaev

Bengt Kleberg

Oct 11, 2012, 10:15:02 AM
to Erlang-Questions Questions
Greetings,

The difference is that the process dictionary is not an experimental
feature.



bengt

On Thu, 2012-10-11 at 17:48 +0400, Dmitry Belyaev wrote:
> Isn't this feature, I mean process dictionaries, much worse for the functional nature of Erlang than parameterized modules?
>
> Why is nobody talking about removing pdicts, while there is such talk about functional pmods?
>

Charles Hixson

Oct 11, 2012, 11:48:07 AM
to Attila Rajmund Nohl, Erlang-Questions Questions
On 10/11/2012 02:20 AM, Attila Rajmund Nohl wrote:
> 2012/10/11 Charles Hixson<charle...@earthlink.net>:
>> I'm choosing a language to implement a ... well, neural network is wrong,
>> and so is cellular automaton, but it gives the idea. Anyway, I'm going to
>> need, in each cell, a few stateful items, e.g. activation level.
>>
>> When I look at what Erlang can do, I see that the Process Dictionary looks
>> as if it would serve my needs, but then I am immediately warned not to use
>> it, that it will cause bugs. These stateful terms will not be exported from
>> the cell within which they are resident. Is this still likely to cause
>> problems? Is there some better approach to maintaining state? (I can't
>> just generate a new process, because other cells will need to know how to
>> access this one, or to test that it has been rolled out.)
> I don't really understand why you can't generate a new process for
> each cell - just send a message to the neighbouring cells that there's
> a new cell. I think each cell needs to know its neighbours anyway.
>
Because lots of other processes would hold links to the cell in the form
of its process id. (There isn't really any other way to link to an active
process.) The cell doesn't know who holds these links. "Neighbors" isn't the
right way to think about it, as they aren't neighbors in any meaningful
sense of the term. (The links are one way.)

Perhaps I should have called it a weighted directed graph, but that
isn't quite the right model either. But with that analogy the weights
need to be adjustable. Or I could have called it a neural net, but that
also isn't quite the right model, at least as I understand it.

Each cell needs to know the cells that it links to, and the cell that
most recently linked to it. And its activation level. And the weight
of each link to a "following" cell. The activation levels and weights
need to be able to change, but should not be visible outside the cell
(though the cell should be able to receive messages that cause it to
change these values).

It's also possible that eventually I might need some "regional values",
that would do things like adjust the sensitivity of all cells in a
region to being activated, but I can see a way to do that with message
passing (though it does add significantly to the overhead). OTOH, maybe
I'll never need these "emotional variations". (So far I don't have a
well-defined idea of what "region" would mean.)

--
Charles Hixson

Max Lapshin

Oct 11, 2012, 12:01:55 PM
to Dmitry Belyaev, Erlang-Questions Questions
On Thu, Oct 11, 2012 at 5:48 PM, Dmitry Belyaev <be.d...@gmail.com> wrote:
> Isn't this feature, I mean process dictionaries, much worse for the functional nature of Erlang than parameterized modules?
>
> Why is nobody talking about removing pdicts, while there is such talk about functional pmods?
>

Damn!

Why do you want to remove any feature you are scared of?
Are you using some library that uses the pdict and spoils your
life? I think not.

The process dictionary is a very, very convenient tool. It is the only way
to get information from a process without sending a message to it. It is
the only way to get information about a process whose message queue length
is more than several thousand messages, and you ask when OTP will
remove it?

Michael Truog

Oct 11, 2012, 12:58:35 PM
to Charles Hixson, Erlang-Questions Questions
You mentioned the variable number suffix as an emacs problem, but there is a parse transform that can help hide the variable number suffixes for you, here: https://github.com/spawngrid/seqbind . However, it is probably best to avoid parse transforms while learning Erlang, since the source code may not match examples.

The weighted directed graph or neural net type of application is probably best done with an Erlang process per cell and messages in between. That approach matches the actor model, and Erlang. A tutorial that should be helpful is here: http://www.trapexit.org/Erlang_and_Neural_Networks
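
A rough sketch of the process-per-cell idea, under stated assumptions: the module `graph_cell`, the record fields, and the `link_to`/`signal`/`get_state` messages are all invented for illustration, and the propagation rule (forward the incoming amount scaled by the link weight) is a placeholder rather than anything proposed in this thread.

```erlang
-module(graph_cell).
-export([start/0, link_to/3, fire/2, state/1]).

-record(cell, {activation = 0.0,
               last_from,        % pid of the most recent sender
               out = []}).       % outgoing links: [{TargetPid, Weight}]

start() ->
    spawn(fun() -> loop(#cell{}) end).

%% Add or replace a one-way weighted link from Cell to Target.
link_to(Cell, Target, Weight) ->
    Cell ! {link_to, Target, Weight},
    ok.

%% Send a signal of the given amount to Cell.
fire(Cell, Amount) ->
    Cell ! {signal, self(), Amount},
    ok.

%% Fetch {Activation, LastFrom, OutLinks} for inspection.
state(Cell) ->
    Ref = make_ref(),
    Cell ! {get_state, self(), Ref},
    receive {Ref, S} -> S after 1000 -> timeout end.

loop(C = #cell{activation = A, out = Out}) ->
    receive
        {link_to, Target, Weight} ->
            Out1 = lists:keystore(Target, 1, Out, {Target, Weight}),
            loop(C#cell{out = Out1});
        {signal, From, Amount} ->
            %% Forward the weighted signal along every outgoing link.
            Self = self(),
            lists:foreach(fun({T, W}) -> T ! {signal, Self, Amount * W} end,
                          Out),
            loop(C#cell{activation = A + Amount, last_from = From});
        {get_state, From, Ref} ->
            From ! {Ref, {A, C#cell.last_from, Out}},
            loop(C)
    end.
```

Note that a cycle in the graph makes a signal circulate indefinitely in this sketch; a real design would need a threshold or decay rule to stop it.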

Dmitry Belyaev

Oct 11, 2012, 1:55:17 PM
to Max Lapshin, Erlang-Questions Questions
I don't want any feature to be removed. Process dictionaries are useful sometimes. So are parameterized modules.

The fact that sometimes people use those features improperly doesn't mean the feature is bad.

I hope no features we currently have in the language will be removed. Even experimental ones.

--
Dmitry Belyaev

Richard O'Keefe

Oct 11, 2012, 9:53:25 PM
to Dmitry Belyaev, Erlang-Questions Questions

On 12/10/2012, at 2:48 AM, Dmitry Belyaev wrote:

> Isn't this feature, I mean process dictionaries, much worse for the functional nature of Erlang than parameterized modules?

"Much worse"? I'm not sure how you would measure that.

Processes and message passing (which let you simulate
shared mutable variables) clearly cancel out properties
that you would otherwise expect a functional language
to enjoy, while the process dictionary and parameterised
modules do not.
>
> Why is nobody talking about removing pdicts, while there is such talk about functional pmods?

Because the process dictionary interface has been a
documented official this-isn't-going-away part of Erlang
for a long time, whereas parameterised modules have
always been this-is-experimental-use-at-your-own-risk-
but-we-TOLD-you sort of thing.

Modules with parameters in ML ('functors') are a very
powerful structuring tool, but they go with a number of
things Erlang is lacking, like nested modules, module
types ('signatures'), and a formal semantics for the
things, and still had several major problems that required
repeated redesign. It may not be without significance
that the Haskell designers, willing to experiment boldly
in so many ways, refuse to take one tiny step in that
direction.

The ability to use frames as quasi-modules (if/when we get
frames) will provide an alternative.

Charles Hixson

Oct 11, 2012, 11:35:50 AM
to Erlang-Questions Questions
I thought process dictionaries were "per process". If so, and I don't
export the values (which I don't intend to), then how would any other
part even have access to them?

What these would be used for is things like adjusting the weight of a
weighted graph cell, or deciding that a cell was too inactive, and
should be rolled out. (I haven't worked out just how to do that,
wanting to decide first if I should learn Erlang, or look for another
language that could do what I need.) In analogy to C, they would be the
equivalent of a static local variable. Or perhaps a C++ class private
variable is closer.

--
Charles Hixson

Max Lapshin

Oct 12, 2012, 2:22:48 AM
to Charles Hixson, Erlang-Questions Questions
No. The process dictionary is readable from outside the process.

It should be used only as a well-known hack where you cannot work without it.
For example, it is the only way to collect information about a busy process.

Think of the process dictionary as a way to attach labels to processes.

For example, in erlyvideo it is used to mark processes like this:
put(flu_name, {stream, <<"stream1">>}) and thus I can tell that
it is this problematic stream that consumed 5 GB of RAM.
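
A small sketch of that labelling trick. The key `flu_name` is the one from the example above; the module `label_demo` and its functions are made up for illustration.

```erlang
-module(label_demo).
-export([start/1, label_of/1]).

%% Start a worker that labels itself in its own process dictionary.
start(Name) ->
    spawn(fun() ->
                  put(flu_name, {stream, Name}),
                  receive stop -> ok end
          end).

%% Read the label from outside, without sending the worker a message.
label_of(Pid) ->
    case process_info(Pid, dictionary) of
        {dictionary, Dict} -> proplists:get_value(flu_name, Dict);
        undefined -> undefined    % the process is no longer alive
    end.
```

`label_of/1` works even when the worker is busy or its mailbox is long, because it never waits for a reply. Calling it immediately after `start/1` may still return `undefined` if the worker has not yet executed its `put`.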

Richard O'Keefe

Oct 12, 2012, 3:28:18 AM
to Charles Hixson, Erlang-Questions Questions

On 12/10/2012, at 4:35 AM, Charles Hixson wrote:
> I thought process dictionaries were "per process".

They are.

> If so, and I don't export the values (which I don't intend to), then how would any other part even have access to them?

I'm not sure what you mean by ``export''.
process_info(Pid, dictionary) will let you read the process dictionary
of another process. This is of course meant as a debugging tool, and
if it bothers you,
process_flag(sensitive, true)
will let you say ``this process's dictionary is NOT to be revealed.''

The transformation I sketched didn't allow for debugging tools.
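
A sketch of the two calls mentioned above, used together. The module name `sensitive_demo` and the key `secret` are assumptions for the example.

```erlang
-module(sensitive_demo).
-export([run/0]).

run() ->
    Parent = self(),
    Pid = spawn(fun() ->
                        %% Ask the runtime not to reveal this process's
                        %% dictionary (or message queue) to inspection.
                        process_flag(sensitive, true),
                        put(secret, 42),
                        Parent ! ready,
                        receive stop -> ok end
                end),
    %% Wait until the worker has set the flag and stored its value.
    receive ready -> ok end,
    Info = process_info(Pid, dictionary),
    Pid ! stop,
    Info.
```

According to the `process_flag/2` documentation, a sensitive process reports its dictionary as empty, so `Info` should not contain the `secret` entry. Without the flag, the same call would return it.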

>
> What these would be used for is things like adjusting the weight of a weighted graph cell, or deciding that a cell was too inactive, and should be rolled out. (I haven't worked out just how to do that, wanting to decide first if I should learn Erlang, or look for another language that could do what I need.) In analogy to C, they would be the equivalent of a static local variable. Or perhaps a C++ class private variable is closer.

Since static local variables and class private variables are per-program,
not per-process, neither would seem to be a good analogue.

I get this uneasy feeling that you are letting language features drive
design.