Web frameworks are normally designed as layers (web server ->
middleware -> front controller -> controller -> action -> view ->
etc.). Data needs to be passed from one layer to another. There are 2
ways to pass:
1. Proplist (environment variables)
2. Process dictionary
The 2nd way:
* Is simple and natural in Erlang because normally one HTTP request is
processed by one process.
* Makes application code which uses the framework appear to be clean,
because the application developer does not have to manually pass an ugly
proplist around and around.
I want to ask about the (memory, CPU etc.) overhead of the process
dictionary, compared to a proplist. Which way should be used in a web
framework?
Thanks.
________________________________________________________________
erlang-questions mailing list. See http://www.erlang.org/faq.html
erlang-questions (at) erlang.org
Then later to take out the variable in view:
<%= @my_var %>
I think it is not explicit passing.
I don't have experience with web frameworks, but some Erlang libraries
use a number of processes behind the scenes, and callbacks might be
executed in a quite different process context, so the process
dictionary is less than useful in these cases.
> * Makes application code which uses the framework appear to be clean,
> because application developer does not have to manually pass an ugly
> proplist around and around.
It's the erlang way to carry that kind of stuff around, just think
about the always-present State variable in gen_* callbacks.
Sergej
The process dictionary might seem appealing at first but if you use it
you'll be unable to reason about the data flow just by looking at the
connections across your functions/modules. Debugging stops being a matter of
just looking at your code and starts requiring that you reason about the
side effects that past code interactions might have produced in the process's
current state. Good luck with that. ;)
:Davide
If the data remains constant once the request is parsed, then pass it
around as a dictionary / gb_tree / etc. (or even just a list of {Key,
Value} tuples, which has the fastest look-up time for shortish lists
if you use lists:keyfind).
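A minimal sketch of the list-of-tuples option, for illustration only. The
module name req_list and the keys (method, path, cookie) are made up here;
lists:keyfind/3 itself returns the matching tuple or false.

```erlang
-module(req_list).
-export([lookup/2, demo/0]).

%% Look up Key in a list of {Key, Value} tuples.
%% Returns {ok, Value} or error, so a stored 'false' stays unambiguous.
lookup(Key, Env) ->
    case lists:keyfind(Key, 1, Env) of
        {Key, Value} -> {ok, Value};
        false        -> error
    end.

%% Hypothetical request data: built once after parsing, then only read.
demo() ->
    Env = [{method, 'GET'}, {path, "/articles/1"}],
    {lookup(path, Env), lookup(cookie, Env)}.
```

Since Env never changes after parsing, every layer can simply take it as an
argument and call lookup/2 on it.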
Now in the case where you need to be making random reads AND WRITES to
data like this, across a number of functions, then I can totally
understand using the process dictionary. When I run into situations
like that I wish I could just write a particular module in Python and
be done with it.
/s
For proplist, is there a trick (macro?) to add syntactic sugar to put and get?
List2 = [{Key, Value} | List]
and
proplists:get_value(Key, List)
are somewhat verbose.
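One possible shape for such sugar, as a sketch only: the macro names GET and
PUT and the module name env_sugar are hypothetical, not from any library.
They just wrap the two expressions above.

```erlang
-module(env_sugar).
-export([demo/0]).

%% Hypothetical shorthand macros around the two verbose expressions.
-define(GET(Key, Env), proplists:get_value(Key, Env)).
-define(PUT(Key, Value, Env), [{Key, Value} | Env]).

demo() ->
    Env0 = [],
    Env1 = ?PUT(title, "Home", Env0),
    %% Prepending shadows the older entry, because
    %% proplists:get_value/2 returns the first match.
    Env2 = ?PUT(title, "About", Env1),
    {?GET(title, Env2), ?GET(missing, Env2)}.
```

Plain wrapper functions in a small API module would do the same job and are
easier to trace than macros.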
> Hi,
>
> Web frameworks are normally designed as layers (web server ->
> middleware -> front controller -> controller -> action -> view ->
> etc.). Data needs to be passed from one layer to another. There are 2
> ways to pass:
> 1. Proplist (environment variables)
> 2. Process dictionary
>
> The 2nd way:
> * Is simple and natural in Erlang because normally one HTTP request is
> processed by one process.
> * Makes application code which uses the framework appear to be clean,
> because application developer does not have to manually pass an ugly
> proplist around and around.
> I want to ask about the (memory, CPU etc.) overhead of process
> dictionary, compared to proplist. Which way should be used in a web
> framework?
I would strongly advise against using the process dictionary to pass
data between parts of an Erlang web framework.[1]
You say that "normally one HTTP request is processed by one process" -
by using the process dictionary you require that this is the case for
all code that uses your framework. By making this decision, you are
tying the hands of the users of your framework. They can no longer
choose the process model that suits their problem and must use a one
process per request design. In erlang it's quite common to hand requests
off to other processes, possibly on other nodes, for execution to
balance load, to move computation closer to needed resources, to turn
synchronous tasks into asynchronous ones, to alter process memory
profile, to isolate failures and so on. The use
of the process dictionary precludes all these approaches.
You also say that using the process dictionary "makes application code
which uses the framework appear to be clean". From the point of view of
the maintenance programmer, nothing could be further from the
truth.
Good erlang code does not use the process dictionary. Erlang programmers
usually only have to think about the function body and arguments to work
out what it's going to do. Sprinkling 'get' and 'put' through the code
means that an erlang programmer trying to understand your code now has
to read all the code to figure out why something is happening. The order
in which functions are called becomes important. The behaviour of
functions in other modules becomes important because now there's a
back-channel to propagate bugs, er, state between parts of the code.
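
To make the ordering problem concrete, here is a small sketch. The module
name pd_order and the key 'discount' are invented for illustration.

```erlang
-module(pd_order).
-export([price_implicit/1, price_explicit/2, demo/0]).

%% Implicit: the result depends on whatever an earlier call left in the
%% process dictionary under the (hypothetical) key 'discount'.
price_implicit(Base) ->
    case get(discount) of
        undefined -> Base;
        Percent   -> Base - (Base * Percent div 100)
    end.

%% Explicit: everything the function needs is in its arguments.
price_explicit(Base, Percent) ->
    Base - (Base * Percent div 100).

demo() ->
    erase(discount),
    Before = price_implicit(200),   % nothing stored yet
    put(discount, 10),              % could happen in some other module
    After = price_implicit(200),    % same call, different answer
    erase(discount),
    {Before, After, price_explicit(200, 10)}.
```

The two calls to price_implicit(200) look identical at the call site, yet
they return different values; price_explicit/2 cannot behave that way.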
As a (curmudgeonly) future web framework user, I would almost certainly
not choose a framework based on the use of non-erlangy features[2] such
as the process dictionary: firstly because the code would be more
difficult to understand when someone needs to maintain it, and secondly
because the process dictionary would prevent me from using a different
process model if I needed to.
Using a proplist or 'dict' or some opaque datastructure and an API
module is the natural, erlangy way to solve your problem.
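
A rough sketch of what "opaque data structure plus API module" could look
like. The module name req_env and its function names are hypothetical, and
the 'dict' backing is just one choice; callers only ever go through the API.

```erlang
-module(req_env).
-export([new/0, store/3, lookup/2, lookup/3]).
-export_type([env/0]).

%% Callers only see env(); the dict representation can change later
%% without touching application code.
-opaque env() :: dict:dict().

-spec new() -> env().
new() -> dict:new().

-spec store(term(), term(), env()) -> env().
store(Key, Value, Env) -> dict:store(Key, Value, Env).

-spec lookup(term(), env()) -> term().
lookup(Key, Env) -> lookup(Key, undefined, Env).

%% lookup/3 takes a default to return when Key is absent.
-spec lookup(term(), term(), env()) -> term().
lookup(Key, Default, Env) ->
    case dict:find(Key, Env) of
        {ok, Value} -> Value;
        error       -> Default
    end.
```

Usage would be along the lines of Env1 = req_env:store(title, "Home",
req_env:new()) in one layer and req_env:lookup(title, Env1) in a later one,
with Env1 passed as an ordinary argument.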
Good luck with your framework,
--
Geoff Cant
[1] More generally, I would strongly advise against using the process
dictionary.
[2] Ditto for parameterized modules and hierarchical module names[3]
[3] I'm already guilty of this, but promise not to do it again.
1. It makes it difficult if you ever do need to split stuff up into
multiple processes. As soon as you need a "helper" process, all of
your data is inaccessible.
2. It makes it difficult to debug via tracing. You can trace
function calls, but tracing changes to the process dictionary is a bit
more hairy.
3. If you want to "hide" the proplist, just pass around some sort of
"request" record. Then you don't see it unless you access the
record. Or use macros and message a "request" process for the data.
4. If you use Mnesia, transactions can be retried, so you may end up
with your mutations happening multiple times if anything happens
inside of a transaction.
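Point 1 can be seen in a few lines. This is only a sketch; the module name
pd_helper and the key 'user' are made up.

```erlang
-module(pd_helper).
-export([demo/0]).

demo() ->
    put(user, "alice"),          % lives in the caller's dictionary only
    Parent = self(),
    User = get(user),            % captured explicitly for the helper
    %% The helper has its own, separate dictionary, so get(user) there
    %% does not see the parent's value. The captured User does arrive.
    spawn(fun() -> Parent ! {helper, get(user), User} end),
    Result = receive
                 {helper, FromDict, FromArg} -> {FromDict, FromArg}
             after 1000 ->
                 timeout
             end,
    erase(user),
    Result.
```

In the helper, the dictionary lookup yields undefined while the explicitly
passed value is intact.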
I don't deny that Erlang needs some better syntax and conventions for
passing data around (preferably via namespace magic like Ruby or
Python), but the process dictionary can break quite a few
assumptions and quite a few useful patterns. This could be fine
if you're the only one writing the code, but the process
dictionary, in the wrong hands, can make many an Erlanger curse your
name.
--
Jayson Vantuyl
kag...@souja.net
On Oct 29, 6:39 am, Geoff Cant <n...@erlang.geek.nz> wrote:
> Good erlang code does not use the process dictionary.
Note that both gen_server and wx libraries make use of the PD.
/s
> On Oct 29, 6:39 am, Geoff Cant <n...@erlang.geek.nz> wrote:
>
> > Good erlang code does not use the process dictionary.
>
> Note that both gen_server and wx libraries make use of the PD.
As does Yaws. The fact that a new process spawned by a request handler
process can't access the Yaws data stored in the request handler process's
dictionary sometimes comes up as an issue, but all in all it's not a
frequent problem. Still, Klacke or I will soon be adding a function to Yaws
to copy the necessary data into the process dictionary of a new process to
allow users to avoid this problem, but again, from what I've seen the
problem doesn't come up all that often in practice.
--steve
> On Oct 29, 6:39 am, Geoff Cant <n...@erlang.geek.nz> wrote:
>
>> Good erlang code does not use the process dictionary.
>
> Note that both gen_server and wx libraries make use of the PD.
>
I think you'll find that gen_server makes almost no use of the process
dictionary. The gen_server code itself makes only one reference to the
process dictionary - to pass the pid of the parent process between the
proc_lib:init stage and the gen_server:enter_loop stage through
arbitrary intervening user code.
proc_lib only uses the process dictionary to store the initial_call and
parent pids of the process. The initial_call is only used in exit reports
and the ancestor/parent pid information is used mainly in exit reports,
though also occasionally by appmon to draw the process ancestry tree and
httpc_manager to implement the is_inets_manager() function.
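
If you want to look at what proc_lib leaves behind, something like the
following sketch should do. The module name pd_proclib is made up, and the
key names it finds ('$ancestors' and '$initial_call' in the OTP releases I
have looked at) are internal to proc_lib, so treat them as an assumption
rather than a documented interface.

```erlang
-module(pd_proclib).
-export([demo/0]).

%% Returns the sorted list of keys found in the process dictionary of a
%% process started through proc_lib.
demo() ->
    Parent = self(),
    Pid = proc_lib:spawn(fun() ->
                                 %% Tell the parent we are running, so it
                                 %% inspects us after proc_lib's setup.
                                 Parent ! {ready, self()},
                                 receive stop -> ok end
                         end),
    receive
        {ready, Pid} -> ok
    after 1000 ->
        exit(timeout)
    end,
    {dictionary, Dict} = erlang:process_info(Pid, dictionary),
    Pid ! stop,
    lists:sort([Key || {Key, _Value} <- Dict]).
```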
I don't think the minimal use of the process dictionary in either of
these cases undermines the case that good erlang code does not use the
process dictionary. Occasionally you might have to, to work around
historic code that you can't refactor for backwards compatibility
reasons, or for some particular debugging cases. But that doesn't fall
into my definition of good - just the best possible given the
constraints. New erlang code should strive to avoid the process
dictionary if at all possible.
In the case of 'wx', the process dictionary is used to pass around a
#wx_env{} structure - I'm not quite sure why this wasn't just a
parameter instead. Maybe there's something else (like port ownership?)
that makes tying the structure to the process in the process dict make
sense? I'm kinda curious about this one - if there's a good argument to
be made for using the process dictionary, I'll amend or retract my
anti-process-dictionary bigotry :)
Cheers,
--
Geoff Cant
So Yaws will stick with process dictionary?
Actually I am creating this:
http://github.com/ngocdaothanh/ale
It only adds some routing rules and MVC conventions to make Yaws
easier to use. A typical Ale application would have a front controller
behind Yaws' back. Behind Yaws' back, either process dictionary or
proplist can be used, and I want to decide which one to use.
Ngoc.
On Fri, Oct 30, 2009 at 7:12 AM, Steve Vinoski <vin...@gmail.com> wrote:
> As does Yaws. The fact that a new process spawned by a request handler
> process can't access the Yaws data stored in the request handler process's
> dictionary sometimes comes up as an issue, but all in all it's not a
> frequent problem. Still, Klacke or I will soon be adding a function to Yaws
> to copy the necessary data into the process dictionary of a new process to
> allow users to avoid this problem, but again, from what I've seen the
> problem doesn't come up all that often in practice.
>
> --steve
> Steve,
>
> So Yaws will stick with process dictionary?
>
To the best of my knowledge, yes.
> Actually I am creating this:
> http://github.com/ngocdaothanh/ale
>
> It only adds some routing rules and MVC conventions to make Yaws
> easier to use.
What parts are difficult to use?
> A typical Ale application would have a front controller
> behind Yaws' back. Behind Yaws' back, either process dictionary or
> proplist can be used, and I want to decide which one to use.
>
Klacke of course has the final say when it comes to Yaws, but I'm pretty
certain there are no plans to move it away from using the process
dictionary.
--steve
If Yaws aims at the application level, then it lacks the feel of a web
application framework like Sinatra, Merb, Rails etc.
If Yaws aims at the web server level, then everything is OK:
* The .yaws file part is like Apache CGI
* The appmods part is like Java Servlet
* Based on the powerful bare bones that Yaws provides, higher-level
frameworks like Nitrogen can be *easily* constructed
Ngoc
The use of the PD in gen_server is its own internal business. I haven't
seen this PD from the outside and I'm not going
to think about it. But using the PD to pass variables from the controller
layer to the view layer in Erlang is a VERY, VERY bad
way of programming. It is sufficient reason not to use such software
at all, because it is a sign that the rest of the code is of very bad quality.
Explicit passing of data generated in the controller to the view is:
a) clear to understand
b) easy to hook and modify, cache, etc.
c) testable
d) separable (you may move templates from Erlang to another application server)
PD is:
a) unclear about what is required and what is passed to the template
b) unmodifiable and unhookable. Business logic migrates to templates
c) very, very hard to test
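A tiny sketch of the explicit variant, just to show the shape. The module
name explicit_mvc and the functions show/1 and render/1 are hypothetical and
not taken from any framework.

```erlang
-module(explicit_mvc).
-export([show/1, render/1, demo/0]).

%% Hypothetical controller action: returns the view's input as a value
%% instead of stashing it somewhere.
show(Id) ->
    [{title, "Article " ++ integer_to_list(Id)}, {id, Id}].

%% Hypothetical view: depends only on its argument, so it can be tested
%% without running the controller or a web server.
render(Assigns) ->
    {title, Title} = lists:keyfind(title, 1, Assigns),
    lists:flatten(["<h1>", Title, "</h1>"]).

demo() ->
    render(show(7)).
```

A test for the view is then a one-liner such as
render([{title, "X"}]), with nothing to set up beforehand.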
Sergej
I agree that using the PD arbitrarily is bad. But using the PD in a
restricted way is controllable.
A web processing process has some unique properties:
* It is normally short-lived; you want as many req/s as possible, right?
* PD data is normally propagated one way. If you want to spawn a new
process, you can clone the PD.
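Cloning could look roughly like this. It is only a sketch: the module name
pd_clone and the helper spawn_with_dict/1 are made up, and this is not how
Yaws or Ale do it.

```erlang
-module(pd_clone).
-export([spawn_with_dict/1, demo/0]).

%% Spawn Fun in a new process whose dictionary starts as a copy of the
%% caller's. get/0 returns the whole dictionary as a {Key, Value} list.
spawn_with_dict(Fun) ->
    Snapshot = get(),
    spawn(fun() ->
                  lists:foreach(fun({K, V}) -> put(K, V) end, Snapshot),
                  Fun()
          end).

demo() ->
    put({app, title}, "Home"),
    Parent = self(),
    spawn_with_dict(fun() -> Parent ! {child, get({app, title})} end),
    Result = receive
                 {child, Title} -> Title
             after 1000 ->
                 timeout
             end,
    erase({app, title}),
    Result.
```

The copy is a snapshot taken at spawn time. Later changes in either process
are not seen by the other, which fits the one-way propagation above.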
Take Ale for an example, see:
http://github.com/ngocdaothanh/ale/blob/master/src/ale.erl
* Things in PD are set only ONCE (well, app_add_head and app_add_js
are accumulative).
* Keys in PD are normally literal. You can easily track back where a
value is set.
* Keys in PD are namespaced, e.g. {app, title}. Things of Yaws, things
of Ale, things of the app in PD are separated.
* When you want to put a thing in PD from a controller, you use
ale:app(Key, Value). When you want to get the thing out from a view,
you use ale:app(Key). You don't spam the PD arbitrarily, you use the
designated API.
This way, PD is like environment variables of a Linux shell session.
The environment is not for sharing mutable things, it is used for
setting things.
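To illustrate the idea (this is a simplified sketch, not the actual Ale
code), a set-once, namespaced wrapper could be written like this. The module
name pd_env and the already_set error are made up for the example.

```erlang
-module(pd_env).
-export([app/1, app/2, demo/0]).

%% Hypothetical set-once wrapper. Keys are namespaced as {app, Key}.
%% A second write to the same key is rejected instead of overwriting.
app(Key, Value) ->
    case get({app, Key}) of
        undefined -> put({app, Key}, Value), ok;
        _Existing -> erlang:error({already_set, Key})
    end.

%% Read a value back; undefined if it was never set.
app(Key) ->
    get({app, Key}).

demo() ->
    erase({app, title}),
    ok = app(title, "Home"),
    Second = try app(title, "Other")
             catch error:{already_set, title} -> rejected
             end,
    Result = {app(title), Second},
    erase({app, title}),
    Result.
```

Because all access goes through one module, the set-once rule is enforced in
one place rather than by convention.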
Ngoc
The environment may be transferred implicitly inside the PD along the way.
I don't know if this is relevant to this discussion, but I am creating
this app which uses Ale which uses Yaws which uses PD:
http://github.com/ngocdaothanh/khale
Nitrogen is another example of the (successful?) use of PD.
Ngoc
Damn, you look at mutable Ruby and try to implement the same thing in
immutable Erlang.
after_filter in Rails can modify anything, including internal hidden
instance variables.
What can you do with set-once variables in Erlang?
My argument for having the 'env' in the process dictionary was that it
is static, and that most of the calls in wx are prefixed with a
'TheObject' as well, so I didn't want the user to keep sending both the
'env' and 'This' in all calls,
wxFrame:setTitle(Env, TheObject, "My window Title"),
vs C++
TheObject->setTitle("My Window Title"),
I could have put the 'env' in each object reference, but that would
have increased every object reference in Erlang by two words. Also, for
static calls which don't work on an object, I would then have to add an
'env' parameter, which isn't there in the C++ lib.
/Dan
For example:
* In the controller you set-once an article in PD:
http://github.com/ngocdaothanh/khale/blob/master/removable/article/c_article.erl
* In the view you take out the article and render:
http://github.com/ngocdaothanh/khale/blob/master/removable/article/v_article_show.erl
The philosophy behind "set-once" is that you need to pass things in
only one way, from a layer to the layers behind it. For this purpose PD is
a perfect medium, and its implicitness makes the code less
verbose. As a framework developer, I think it would make app
developers happy because their code would be less verbose.
When you want to start Eshell, would you want to type "erl" or the
full path to "erl"?
Ngoc
I feel that you want to implicitly say that you find no use in PD and
it should be removed from the next version of Erlang.
This is not related to the process dictionary. If the programmer
implements e.g. a handle_call in gen_server 'A' (or anything that's
called from that handle_call), he has to make sure that he doesn't
call gen_server 'B' if there's a chance that 'B' called 'A' first -
otherwise there would be a deadlock. In this case also a lot of other
code gets important and it's fairly common that a handle_call (or a
function called from handle_call) gets implemented...
Like it or not, the circumstances in which a function is called are
important. This is a limitation in Erlang's "functional
languageness", but it is actually necessary in order to do anything useful
with the language.
The process dictionary could be great for environment-like variables,
which are set only once but used in very many places, where it's very
inconvenient to pass around one more parameter. They don't show up in
function traces - but they do show up in the output of
erlang:process_info(), where e.g. the gen_server state is not shown,
even though it would be dead useful.
Well, if you would like to see the gen_server's state, you can always
call sys:get_status/1.
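For example, with a throwaway gen_server (the module name status_demo and
the integer state are made up for this sketch):

```erlang
-module(status_demo).
-behaviour(gen_server).
-export([demo/0]).
-export([init/1, handle_call/3, handle_cast/2]).

init(Counter) -> {ok, Counter}.
handle_call(_Request, _From, Counter) -> {reply, Counter, Counter}.
handle_cast(_Msg, Counter) -> {noreply, Counter}.

demo() ->
    {ok, Pid} = gen_server:start(?MODULE, 42, []),
    %% The last element is [PDict, SysState, Parent, Debug, Misc].
    %% PDict is the process dictionary; the server state is somewhere
    %% inside Misc, whose exact layout is internal to gen_server.
    {status, Pid, {module, _Mod}, [PDict, SysState, _Parent, _Debug, Misc]} =
        sys:get_status(Pid),
    ok = gen_server:stop(Pid),
    {PDict, SysState, Misc}.
```

So the same call gives you both the process dictionary and the callback
state, as long as the process handles system messages.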
Regards,
Tamas
Tamas Nagy
Erlang Training & Consulting
http://www.erlang-consulting.com