This means that, with respect to concurrency, you need to treat Shen
as a procedural system in terms of handling side-effects: you need to
consider the propagation of side-effects and the ordering constraints
upon operations.
For this reason I do not think that it is useful or correct to think
about concurrency in Shen in functional terms.
A process is a (virtual) machine.
Qi defines a virtual machine for Qi code to run in, so this is our
starting position.
Qi defines a basic input port via (input) and an output port via
interleaved (print) and (output) operations.
We could add multiprocessing trivially by adding an optional port
parameter to these three operations, a new operation to establish a
port to a Qi process address, and an operation -- e.g., select -- to
inform the process about the set of ports with pending inputs to be
read.
e.g.,

  (set *port* (make-port "abc"))
  (print [+ 2 3] *port*)
  (print (input *port*))
If there is no process existing at an address, then one can be generated.
The default Qi process could then effectively (set *default-port*
(first (select))) and everything could proceed normally.
Upon exit the process would be destroyed, leaving the address vacant,
and any undelivered message could generate a new process to handle it.
That would be the least intrusive approach to multiprocessing in Qi
that would scale that I can see.
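A rough sketch of this port model, in Python for concreteness (make_port, send, receive, and select are hypothetical names standing in for the Qi operations described above; the registry is in-process rather than distributed):

```python
import queue
import threading

# Hypothetical in-process stand-ins for the proposed Qi operations.
_registry = {}
_registry_lock = threading.Lock()

class Port:
    """A two-way channel to the process at some address."""
    def __init__(self):
        self.to_process = queue.Queue()
        self.from_process = queue.Queue()

def make_port(address, behaviour):
    """Return a port to the process at `address`; if no process exists
    there, generate one running `behaviour`."""
    with _registry_lock:
        port = _registry.get(address)
        if port is None:
            port = _registry[address] = Port()
            threading.Thread(target=behaviour, args=(port,), daemon=True).start()
        return port

def send(message, port):          # analogous to (print Message Port)
    port.to_process.put(message)

def receive(port):                # analogous to (input Port)
    return port.from_process.get()

def select(ports):
    """The set of ports with pending input to be read."""
    return [p for p in ports if not p.from_process.empty()]

# A trivial process behaviour: evaluate [+ N M] requests.
def adder(port):
    while True:
        n, m = port.to_process.get()
        port.from_process.put(n + m)

p = make_port("abc", adder)
send((2, 3), p)
print(receive(p))   # 5
```

The default process's (set *default-port* (first (select))) step would just pick any port from select's result and serve it.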
Coroutines and futures would complement multiple processes nicely.
Coroutines would allow the interleaving of operations to support the
demands of multiple i/o ports.
Futures would allow multiprocessing within a process, but the
requirement to maintain the dependency ordering of side-effects makes
this more complicated.
Concurrent shared memory systems (e.g., conventional threads) don't
scale and make every memory operation implicitly contentious (unless
you can prove that it can't involve shared memory). The verdict of the
last decades seems to be that this is a bad idea, and will only get
worse as distributed systems mature.
Anyhow, that's my two cents.
> --
> You received this message because you are subscribed to the Google Groups "Qilang" group.
> To post to this group, send email to qil...@googlegroups.com.
> To unsubscribe from this group, send email to qilang+un...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/qilang?hl=en.
>
>
Well, it's fairly central to concurrency -- so you need to work out if
we're talking about Qi or a purely functional subset of it.
>
> The reason I am asking the question 'What is a process?' is because
> unless it can be formally defined then the proper treatment of a
> process is unresolved. I think it should be more than just a question
> of a few primitives; if processes are to be passed around we should
> know what we are talking about. Not answering this question will
> cause problems later.
>
>> A process is a (virtual) machine.
>
> OK. If we say that a process is a virtual machine - how do we define
> a virtual machine? The AUM in FPQi is a virtual machine. It is
> specified by a BNF and a mapping from Prolog into the AUM and from the
> AUM to Lisp. It is an abstraction layer between Prolog and the
> underlying platform. Is a process really like this?
Yes, I think so.
Consider a posix process.
It is a virtual machine that runs a program in a language that is
generally a subset of the instruction set of the underlying
architecture, which communicates with the outside world via a syscall
interface.
A process is fundamentally a machine that interprets a program.
The AUM specifies a machine which could be implemented as an interpreter.
I don't think this is fundamentally different to the posix process case.
Normally you pair the AUM with a specific program and translate that
into a specific program in another language (i.e., compilation), but
that's functionally equivalent to running that AUM program in a CL
interpreter.
Likewise you have the C Abstract Machine that specifies the machine
that strictly conforming C programs run in.
In all cases we have realizations of virtual machines -- some general
and some specialized upon particular programs.
>I think it should be more than just a question
>of a few primitives; if processes are to be passed around we should
>know what we are talking about. Not answering this question will
>cause problems later.
>
>> A process is a (virtual) machine.
I really want abstractions that allow me to safely and
efficiently use multiple threads working in one address space
(e.g. what Clojure is focused on solving). The failure of
speed (raw or architectural) to scale with Moore's law means
multiple core SMP systems are for the foreseeable future
where the action is and for Shen to be competitive in this
environment it needs an approach to concurrency that
addresses this.
The classical Actor model, which is what I think Brian
Spilsbury is advocating, certainly solves the safety issue
but at a great cost for many problems that can fit into one
SMP system since it requires copying data into messages.
While it doesn't answer the question you are asking, I
suspect a purely functional approach should be considered;
certainly without one the utility of a Shen port to Clojure or
any other seriously functional language is questionable.
On the other hand there's the question of what to do about
ports to languages rife with side effects. As Brian correctly
points out, "The verdict of the last decades seems to be that
this is a bad idea, and will only get worse as distributed
systems mature." E.g. witness the general failure to graft
Software Transactional Memory (STM) onto imperative languages
vs. the relative ease of adding it to Haskell or Clojure being
built on it.
- Harold
As I see it, a function is an invariant mapping from domain to range.
Being invariant, it exists outside of time, which removes it from the
domain of concurrency which is concerned with operations that occur
over time.
Which means that in this conversation we must be talking about not
functions but about machines used to compute functions, which I think
is a very important distinction.
>> It is a virtual machine that runs a program in a language that is
>> generally a subset of the instruction set of the underlying
>> architecture, which communicates with the outside world via a syscall
>> interface.
>>
> Well is it? I mean suppose I have a background program that prints
> "hello world" if a specific file is created. Is that program a
> virtual machine? Or even simpler, it counts up to 10^10 while I'm
> typing and then prints the same. I would say it is not; certainly not
> in the usual sense of the phrase.
The program is not a virtual machine, but the machine running the
program is a (virtual) machine.
One clue that you're not talking about a functional system is that
you're saying "while I'm typing".
Here you must be talking about the machine running the program, since
that's the only thing that can be doing something while you're
typing.
In a functional system, we could model the above by constructing a
list encoding the problem, and then using a lazy list that somehow
depends on that first list for the result -- i.e., as a functional i/o
mechanism.
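As a toy illustration of that lazy-list idea, using Python generators as stand-ins for lazy lists (the stream contents are invented for the example):

```python
def keystrokes():
    """A lazy stream standing in for the typing; None marks end of input."""
    yield from ["h", "i", None]

def responses(keys):
    """A lazy list that depends on the input stream -- functional i/o.
    Nothing here acts 'while' anything; the output is simply a function
    of the input list, demanded lazily."""
    count = 0
    for k in keys:
        if k is None:
            yield "hello world (after %d keystrokes)" % count
        else:
            count += 1

print(list(responses(keystrokes())))
```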
>
> But if we were to stretch the phrase in this way, then we really would
> be making the phrase "virtual machine" mean what "process" or
> "external process" means and we would have made little advance in
> saying a process is a virtual machine. What I'm after is a
> characterisation that takes us deeper than swapping one name for
> another and points the way for the correct treatment of processes.
The important point here, in my opinion, is distinguishing between the
program (which might be functional) and whatever machine is used to
compute that program (which will not be functional).
Since functions do not involve time we are limited in the approaches
that we can take here.
(a) We can have a translation process determine where concurrency can
be added without affecting the semantics of the program (which more or
less limits you to implicit futures).
(b) We can try to functionally model time in some way, and have the
translation process use this model to choose concurrency strategies --
which seems difficult.
(c) We can add operations with functional semantics that are used as
directives by a translation process without modeling time -- e.g.,
having a function equivalent to identity that tells the translator
that it should use a future here -- which seems inelegant.
(d) We can abstract other computational devices as data streams and
turn this into essentially an i/o problem which is equivalent to IPC.
Personally, I think that (a) and (d) make the most sense, and these
can be used in conjunction.
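A minimal sketch of option (d), again in Python with generators standing in for data streams (squarer and client are invented names): the other computational device is just a stream transformer, and talking to it is an i/o problem.

```python
def squarer(requests):
    """Option (d): another computational device abstracted as a data
    stream transformer; it could equally live behind a socket."""
    for req in requests:
        yield req * req

def client():
    """The program's side of the conversation is also just a stream."""
    for n in (1, 2, 3):
        yield n

print(list(squarer(client())))   # [1, 4, 9]
```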
The problem is that multicore SMP systems don't scale, either.
And with a shift toward cloud computing I think their dominance will
be extremely short lived.
I regularly use hundreds of machines in order to do large scale
computations and data processing.
While that might be a little expensive for the average user, cloud
providers like Amazon are making it increasingly cheap to simply rent
hundreds of machines on a demand basis.
They do up to a point, which will continue to increase for
some time. I submit that a lot of problems fit within the
limit of what can be economically put on one motherboard.
>And with a shift toward cloud computing I think their dominance will
>be extremely short lived.
When do you think cloud computing vendors (or sites using their
approach) will stop building their clouds on top of SMP systems?
If we look at Amazon's AWS EC2 offerings, every one of them
other than the Micro or Small instance offers 2 or more cores.
>I regularly use hundreds of machines in order to do large scale
>computations and data processing.
>
>While that might be a little expensive for the average user, cloud
>providers like Amazon are making it increasingly cheap to simply rent
>hundreds of machines on a demand basis.
Indeed, and it would be great if Shen addresses this capability in
a native and powerful way. My point is that since it's functional
and some of its target platforms are also functional we should see if
we can also address the SMP problem/opportunity.
- Harold
Isn't this just equivalent to partial evaluation?
Actually, I think there is a terminology gap that will require
examples to bridge.
Can you provide examples of this paralysis and deafness, respectively?
> ** It appears then, that the dilemma we face in understanding how to
> represent processes is a result of current functional programming
> practice and does not arise from the lambda calculus. Functional
> programming has created paralysed abstractions called closures which
> do not sustain execution within themselves. But in lambda calculus we
> find that abstractions are as fully 'unparalysed' as applications.**
> UNQUOTE
Can you provide examples of this paralysis, since I evidently don't
understand what it is?
>
> In my study I call these evaluable abstractions 'lambda processes'.
>
> * A process is a lambda process.
> * A lambda process is an abstraction which can sustain evaluation
> within its body.
> * A lambda process behaves in most important respects like an
> abstraction.
What does it mean to sustain evaluation?
Are you referring to something along the lines of a coroutine?
>
> There is a strong correlation between lambda processes and processes
> in languages like Termite Scheme and Erlang. Effectively if we equip
> a language with lambda processes and include within these processes
> some primitives for handling events, you have, IMO, a model for
> concurrency.
Can you provide an example of two lambda processes interacting?
It's always trivial to deal with an additional core as another machine
that is cheap to talk to.
Dealing with remote machines as additional cores is tricky.
I guess that's the point I want to make with respect to scaling.
A VM is calculating a lambda, and has the following states:
1. Running
2. Waiting for i/o
3. Stopped (because of an error, or because it finished in the ordinary way)
I/o may be done via message passing, using the following base primitives:
1. recv - blocking
2. send - non-blocking
3. peek - like recv, but non-blocking
Or we can do i/o via stdin and stdout and use functions over streams.
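A minimal sketch of those three primitives over a mailbox, in Python (returning None from peek when nothing is pending is an assumption of the sketch, not part of the proposal):

```python
import queue

def send(mailbox, message):
    """Non-blocking send."""
    mailbox.put_nowait(message)

def recv(mailbox):
    """Blocking receive."""
    return mailbox.get()

def peek(mailbox):
    """Like recv, but non-blocking; here None signals 'nothing pending'."""
    try:
        return mailbox.get_nowait()
    except queue.Empty:
        return None

box = queue.Queue()
print(peek(box))     # None -- nothing pending yet
send(box, "ping")
print(recv(box))     # ping
```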
Some primitives for working with a VM as a first-class object:
1. (run (lambda (stdin stdout) ...)) : (stream --> stream --> A) --> [vm A]
2. (stop VM) : [vm A] --> boolean
3. (vm? X) : A --> boolean
4. (vm-state? VM) : [vm A] --> vm-state, where vm-state is one of: running,
error, finished
5. (vm-result? VM) : [vm A] --> A; throws an exception when the vm is in the
error state, and blocks while the vm is running
If we introduce the VM as a first-class object, then we can get type-secure
concurrency.
The VM in this case is just an abstraction, and all of the low-level VM
machinery may be implemented in many ways on the underlying platform.
For instance, a VM could be implemented as an ordinary OS process running
CL with a Shen image, with TCP/IP communication channels.
Or a VM could be implemented using threads and shared memory. Or a VM could
be built on top of green threads in Scheme.
Later, a low-level protocol for communication between VMs could be defined
to allow interoperability between VMs running on different host systems.
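A thread-based sketch of these primitives in Python (simplified: the body is a thunk rather than a (stream --> stream --> A), and stop is omitted since Python threads cannot be killed):

```python
import threading

class VM:
    """A first-class process object with states running / error / finished."""
    def __init__(self, thunk):
        self._result = None
        self._error = None
        self._done = threading.Event()
        threading.Thread(target=self._step, args=(thunk,), daemon=True).start()

    def _step(self, thunk):
        try:
            self._result = thunk()
        except Exception as e:
            self._error = e
        finally:
            self._done.set()

def run(thunk):                    # (run ...)
    return VM(thunk)

def is_vm(x):                      # (vm? ...)
    return isinstance(x, VM)

def vm_state(vm):                  # (vm-state? ...)
    if not vm._done.is_set():
        return "running"
    return "error" if vm._error else "finished"

def vm_result(vm):                 # (vm-result? ...): blocks while running,
    vm._done.wait()                # raises when the vm stopped with an error
    if vm._error:
        raise vm._error
    return vm._result

v = run(lambda: 1 + 2)
print(vm_result(v), vm_state(v))   # 3 finished
```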
Vasil.
This sounds like a future.
> Lambda processes can do more than this if you allow process variables
> to occur in the body - thus in (/! X Y) X is the process variable.
> The trick then is to enrich these processes by a notation that allows
> the programmer to do the necessary conjurations; such creating
> interruptable event loops and processes with stateful memory.
This sounds like a combination of a future and a coroutine.
Is this accurate?
These are normally referred to as "futures".
You can consider a side-effect as being like an unbound variable, which
can be bound by attempting to use the result of the expression.
So (blow-up-the-powerstation) might well be able to proceed with all
of the functional ancillary work involved in that task, and then block
on the detonation until the future is realized by attempting to get
its value.
>
> Based on an operator /! for lambda processes and two primitives while
> and bound? I have found that:
>
> 1. Lambda processes can represent parallelism e.g. parallel or.
> 2. They can represent event loops that can be terminated on command.
> 3. They can be set up to have memory of past events.
> 4. There is a mapping of this model into message passing systems
> like Erlang and Termite Scheme.
This makes a lambda process sound very much like a Qi interpreter ...
which is what I was talking about earlier.
One thing that I think has been neglected is the locality of a process
-- not all resources are equally available to all processes in
concurrent systems, which is why I suggested using a system of
addresses.
Since it is clear that we are now talking about an essentially
procedural system, we need to consider limits on the propagation of
side-effects -- can side-effects be propagated between processes
implicitly or only as messages?
I think that both approaches are useful, but produce qualitatively
different results -- where I want optimistic evaluation (e.g., a
future), then implicit propagation of side-effects is useful and
necessary in a procedural system.
Where I want to deal with effectively a remote Qi interpreter, then I
think that the implicit propagation of side-effects would be a very
bad idea.
There is a third case, which corresponds to generators or coroutines
-- where I want to be able to do a partial computation, yield a
result, then later continue possibly with additional input.
Maybe the distinction here can be between processes with addresses and
processes without addresses -- the address implying a limit to the
propagation of side-effects, requiring messaging to escape?
I think that lambda process bodies should not be distinguished from
ordinary lambdas.
> c. the nature of applying a lambda process to an argument. What
> happens?
(let Proc1 (/! X (while (unbound? X) (thread-yield!) (+ 1 X)))
(let Result1 (Proc1 123)
(Result1 0)))
The first call of the lambda process passes the argument to the process
(X becomes bound) and returns a lambda which is called when we want to
get the result of the process.
> /!
> while
> bound?
That is insufficient. There is a need for a synchronization primitive
like wait-until-bound.
(let Proc1 (/! X (do (wait-until-bound X) (+ 1 X)))
(let Result1 (Proc1 123)
(Result1 0)))
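A Python sketch of this wait-until-bound flavour, with a thread and an event standing in for the lambda process (lambda_process, bind, and force are invented names for the sketch):

```python
import threading

def lambda_process(body):
    """Sketch of (/! X (do (wait-until-bound X) ...)): the body runs in
    its own thread but blocks until X is bound by the first application."""
    x_bound = threading.Event()
    done = threading.Event()
    cell = {}

    def wait_until_bound():          # the synchronization primitive
        x_bound.wait()
        return cell["x"]

    def runner():
        cell["result"] = body(wait_until_bound)
        done.set()

    threading.Thread(target=runner, daemon=True).start()

    def bind(x):                     # first application: binds X ...
        cell["x"] = x
        x_bound.set()
        def force(_):                # ... and returns a lambda that yields
            done.wait()              # the process result when applied
            return cell["result"]
        return force
    return bind

# Mirrors (let Proc1 (/! X (do (wait-until-bound X) (+ 1 X)))
#           (let Result1 (Proc1 123) (Result1 0)))
proc1 = lambda_process(lambda X: 1 + X())
result1 = proc1(123)
print(result1(0))    # 124
```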
There are other questions:
1. How do we deal with variables captured by the process body?
(let A 1
  (let P (/! X (+ X A))))
Do we treat captured variables like captured variables in ordinary
lambdas? Can a process modify captured variables?
2. Can a process access global variables using (value Var)?
Can a process write to global variables?
Vasil
(/! X (while (unbound? X)
        ; optimistic evaluation -- wait
        (look-for-event)
        ; yield the final result, which is another lambda process,
        ; to which we transfer control.
        (if (= 1 3) halted ((event-loop-closure) (+ 1 1)))))
(define repl
-> (repl-help (event-loop-closure 1)))
(define repl-help
  L-process -> (let I (do (output "~%>") (input))
                 (if (= I exit)
                     (repl-help (if (= L-process halted) halted (L-process 0)))
                     (do (print (eval I)) (repl-help L-process)))))
So L-process has either already yielded its final result (which turns
it into halted, effectively), or we bind X in it to 0, causing it to
yield as its final result either halted or another lambda process.
In this scheme it appears that we cannot access the result of anything
in the 'optimistic evaluation' section, so there is never any point in
putting anything other than (look-for-event) there.
This looks similar to using futures that are forced by application,
and do nothing until forced -- is this fair?
I'd like to suggest a more practical example -- asynchronous i/o.
In my application I know that I will need the result of a database
lookup, but I have something to do before I can make use of it, so I'd
like the database lookup to be able to run in parallel while I'm doing
this.
I can see how I could use futures or a lazy list to achieve this, but
how could I use a lambda process to do it?
e.g., (using a made up lisp)
(let ((io (future (look-up-database-key "foo"))))
(let ((value (do-something)))
(+ value (force io))))
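The same made-up-lisp example translated into Python with concurrent.futures (the database lookup and its result are faked for the sketch):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def look_up_database_key(key):
    time.sleep(0.05)               # stand-in for i/o latency
    return {"foo": 40}[key]        # faked database contents

def do_something():
    return 2

with ThreadPoolExecutor() as pool:
    io = pool.submit(look_up_database_key, "foo")  # (future ...)
    value = do_something()        # runs while the lookup is in flight
    print(value + io.result())    # (force io) -> 42
```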
The next concern that I have is that since the lambda process does not
retain identity between steps, it is not possible to have a three-way
conversation with a process -- this is also true of lazy lists --
since the last reply becomes essentially a capability for continuing
the conversation.
I suspect that this makes this model almost useless for anything other
than what futures do, but I'm probably missing something.
So, as a second practical example, how could I implement the database,
as a lambda process, that I looked up in the first example, such that
it could service multiple clients, and such that they could use it to
hold an indirect conversation?
I suspect that you would need to do it as you would for lazy lists,
and have them conspire via side-effect with some central location.
But since this model does not represent processes as machines, there
is nowhere for those side-effects to be contained within ...
Yes, but I think there is one part missing -- address -- a mechanism
by which we can locate a machine, and by which we can estimate the
cost for a given machine to perform some action (since not all
machines have equal access to all resources).
We can also use addresses to handle basic resource consensus -- as we
use dynamic variables in CL to do.
If sending a message to an address where there is no machine currently
can (optionally) establish a machine at that address to receive that
message, then we can use this to establish agents for coordination --
a trivial example would be a semaphore machine.
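A Python sketch of that trivial example (registry, send_to, and the message shapes are invented): delivering a message to a vacant address establishes a machine there to receive it, and the machine is a semaphore agent that grants acquire requests in order.

```python
import queue
import threading

registry = {}
registry_lock = threading.Lock()

def semaphore_machine(mailbox):
    """An agent serializing access to a resource. Messages are
    ('acquire', reply_queue) or ('release',)."""
    held = False
    waiting = []
    while True:
        msg = mailbox.get()
        if msg[0] == "acquire":
            if held:
                waiting.append(msg[1])
            else:
                held = True
                msg[1].put("go")
        elif msg[0] == "release":
            if waiting:
                waiting.pop(0).put("go")   # hand the semaphore on
            else:
                held = False

def send_to(address, message, behaviour=semaphore_machine):
    """Deliver a message; if no machine exists at the address,
    establish one to receive it."""
    with registry_lock:
        box = registry.get(address)
        if box is None:
            box = registry[address] = queue.Queue()
            threading.Thread(target=behaviour, args=(box,), daemon=True).start()
    box.put(message)

reply = queue.Queue()
send_to("sem-1", ("acquire", reply))   # spawns the semaphore machine
print(reply.get())                     # go
```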
I think that a quintuple <A, S, I, O, E> is probably sufficient to
describe a qi-machine.
The critical point of thinking in terms of machines is that it acts
as a limit upon the ability to reason about side-effects, and requires
side-effects propagating beyond that limit to be explicitly expressed
in terms of messages -- and we cannot reason about messages in the
same way because messages are fundamentally unreliable.
> Mark
In the case of uniprocessor and multiprocessor systems we normally
consider these reliable, because they generally fail as a whole (e.g.,
our process or machine dies).
In this case the semantics of message passing can be equivalent to CPS
application or assignment -- we don't need to consider failure because
if it fails, so will everything else.
In the case of a message between machines, we need to consider the
possibility that the message was corrupted, not delivered, or that it
was delivered but the other system failed.
Within a reliable system we can use transparent mechanisms such as
futures, and we need to consider the propagation and ordering of
implicit side-effects (if any).
Across an unreliable system we need to make communication explicit in
order to support failure.
I'm not sure that trying to unify these is a sensible idea.
This is fundamental, not superficial.
While it can indeed add up, I would hazard a guess that in the
Clojure world it is seen as more important to be flexible and clean at
a higher level than to be semantically hampered but faster?