Each node running the same executable?


jer...@jeremyhuffman.com

unread,
Jun 20, 2013, 8:11:28 PM6/20/13
to parallel...@googlegroups.com
Hi, I've browsed your docs a few times and I thought I remembered seeing a reference to this, but I couldn't find it now. I'm trying to understand whether each node needs to be running the exact same executable, or whether there is binary compatibility for data structures, assuming the types stay the same. Can you explain how this works?

Thanks,

Jeremy

Jeff Epstein

unread,
Jun 20, 2013, 8:21:21 PM6/20/13
to jer...@jeremyhuffman.com, parallel...@googlegroups.com
In a nutshell: the only substantial requirement is that the remote
table and message serialization have to be the same in all binaries
that communicate with each other. This means that, with care, you can
use the same library code in different binaries, and they will be able
to send messages and closures.

A caveat: one should be careful to avoid using any architecture
dependent types (e.g. Int) in messages, since that type could have
different sizes on 32-bit and 64-bit machines. Use fixed-size types
instead, like Int32.
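A minimal sketch of this caveat, assuming the `binary` package for serialization (the `Ping` type and its fields are hypothetical, invented for illustration):

```haskell
{-# LANGUAGE DeriveGeneric #-}
-- Illustrating the caveat above: use fixed-size integer types in
-- message payloads so 32-bit and 64-bit nodes agree on the values
-- being exchanged. (Ping is a hypothetical message type.)
import Data.Binary (Binary, decode, encode)
import Data.Int (Int32, Int64)
import GHC.Generics (Generic)

-- Risky alternative: a field of type Int has a platform-dependent
-- range, so a value produced on a 64-bit sender may not be
-- representable on a 32-bit receiver. Fixed-size types avoid that.
data Ping = Ping { workUnits :: Int32, sentAt :: Int64 }
  deriving (Show, Eq, Generic)

instance Binary Ping  -- derived via Generic

main :: IO ()
main = do
  let msg = Ping 42 1371772288
  -- an encode/decode round-trip stands in for a network hop
  print (decode (encode msg) == msg)
```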

These requirements may change if the language is extended to natively
support Static.

Reid Draper

unread,
Jun 20, 2013, 9:18:27 PM6/20/13
to parallel...@googlegroups.com, jer...@jeremyhuffman.com


On Thursday, June 20, 2013 8:21:21 PM UTC-4, Jeff Epstein wrote:
In a nutshell: the only substantial requirement is that the remote
table and message serialization have to be the same in all binaries
that communicate with each other. This means that, with care, you can
use the same library code in different binaries, and they will be able
to send messages and closures.

A caveat: one should be careful to avoid using any architecture
dependent types (e.g. Int) in messages, since that type could have
different sizes on 32-bit and 64-bit machines. Use fixed-size types
instead, like Int32.

These requirements may change if the language is extended to natively
support Static.

Any chance that native support for Static would tighten these requirements? For many of the problems I'm interested in, having to have the same binary on all nodes would be a deal-breaker. I think the same would be true of anyone wanting to build loosely-coupled systems.

Jeff Epstein

unread,
Jun 20, 2013, 9:25:50 PM6/20/13
to reidd...@gmail.com, parallel...@googlegroups.com, jer...@jeremyhuffman.com
On 6/20/13, Reid Draper <reidd...@gmail.com> wrote:
> Any chance that native support for Static would tighten these requirements?
>
> For many of the problems I'm interested in, having to have the same binary
> on all nodes would be a deal-breaker. I think the same would be true of
> anyone wanting to build loosely-coupled systems.

I think a lot of people feel the same way. A proper native Static
would have to take that into account. As far as I know, no one has put
forth a real proposal, as the "improvised Static" has so far been
adequate, albeit somewhat clumsy.

Jeff

Simon Peyton-Jones

unread,
Jun 21, 2013, 5:43:12 AM6/21/13
to reidd...@gmail.com, parallel...@googlegroups.com, jer...@jeremyhuffman.com

For many of the problems I'm interested in, having to have the same binary on all nodes would be a deal-breaker. I think the same would be true of anyone wanting to build loosely-coupled systems.

 

Many, many issues hide here.  If you want to send a code pointer (and we do – that is what the static thing is all about), you can

A.      Send a code pointer (cheap).  But this means that the same code has to live at the other end, so you can talk about it by name; and that’s precisely what you don’t want.

B.      Send the code itself.  And the code that it refers to, and so on, transitively.

C.      A combination of (A) and (B).  Start with (B), but cache the code at the remote end, so you don’t send the same code twice.

 

Under a vision of (C) you’d start with all-empty nodes, except one.  When it spawns processes elsewhere, it must send the code (ALL the code, transitively).  But the remote nodes cache it.  Over time they all get all the code (well, all the code they need).  World peace breaks out.

 

But it’s complicated to implement this vision:

·         What do we mean by “same code”?  Probably a fingerprint of the transitive closure of the code.  Really the same all the way to the leaves.

·         How do we ship code?  Haskell source? Bytecode?  Object code?  Core lambda code?

 

In the end I think this is the Right Thing. But it’ll be some work to get there.

 

Simon

Tim Watson

unread,
Jun 21, 2013, 5:52:56 AM6/21/13
to sim...@microsoft.com, reidd...@gmail.com, parallel...@googlegroups.com, jer...@jeremyhuffman.com
On 21 Jun 2013, at 10:43, Simon Peyton-Jones wrote:
 
But it’s complicated to implement this vision:
·         What do we mean by “same code”?  Probably a fingerprint of the transitive closure of the code.  Really the same all the way to the leaves.
·         How do we ship code?  Haskell source? Bytecode?  Object code?  Core lambda code?
 

Edsko has submitted a patch that adds support for hs-plugins to distributed-static (viz dynamic) which might be instructive here.

In the end I think this is the Right Thing. But it’ll be some work to get there.
 

In the meantime, you can safely exchange 'messages' between nodes, given the caveats Jeff mentioned previously. With that capability in mind, it's possible to run different services on each node, provided they all "speak the same types", as it were.

Cheers,
Tim

Jost Berthold

unread,
Jun 21, 2013, 6:01:23 AM6/21/13
to sim...@microsoft.com, parallel...@googlegroups.com, jer...@jeremyhuffman.com
On 06/21/2013 11:43 AM, Simon Peyton-Jones wrote:
> For many of the problems I'm interested in, having to have the same
> binary on all nodes would be a deal-breaker. I think the same would be
> true of anyone wanting to build loosely-coupled systems.
>
> Many, many issues hide here. If you want to send a code pointer (and we
> do – that is what the static thing is all about), you can
>
> A. Send a code pointer (cheap). But this means that the same code has to
> live at the other end, so you can talk about it by name; and that's
> precisely what you don't want.

Probably one should point to some ideas regarding this option A.
In the context of parallel Haskell implementations on distributed-memory
platforms (GUM, Eden implementation), the exact same mechanism is
required in the runtime, and I have done some prototyping towards
exposing it as an API to Haskell.
Jost Berthold. Orthogonal Serialisation for Haskell.
In: IFL 2010, LNCS 6647
http://www.diku.dk/~berthold/papers/mainIFL10-withCopyright.pdf
Whether you send a code pointer or a name that maps to one is a detail;
my understanding is that Cloud Haskell's remote table is essentially a
mapping from the latter to the former.

> B. Send the code itself. And the code that it refers to, and so on,
> transitively.
>
> C. A combination of (A) and (B). Start with (B), but cache the code at
> the remote end, so you don't send the same code twice.

Getting to B would be great.

> Under a vision of (C) you'd start with all-empty nodes, except one.
> When it spawns processes elsewhere, it must send the code (ALL the code,
> transitively). But the remote nodes cache it. Over time they all get
> all the code (well, all the code they need). World peace breaks out.

C is even harder: in order to really benefit from caching, a sending
node would need information about the cache of _another_ (i.e. the
receiving) node to avoid sending existing code. One could certainly
construct a case where code is received from two sending nodes,
because they cannot possibly have this information (no matter how they
exchange it, data could be "in flight").

That said, it would definitely be very exciting to get such
serialisation support, even if at the price of some code duplication
(transfers should be limited anyway).
People have experimented with "mobile Haskell" before, sending
bytecode; the publications are certainly worth looking at.

/ Jost Berthold




Reid Draper

unread,
Jun 21, 2013, 10:32:29 AM6/21/13
to Simon Peyton-Jones, parallel...@googlegroups.com, jer...@jeremyhuffman.com

On Jun 21, 2013, at 5:43 AM, Simon Peyton-Jones <sim...@microsoft.com> wrote:

For many of the problems I'm interested in, having to have the same binary on all nodes would be a deal-breaker. I think the same would be true of anyone wanting to build loosely-coupled systems.
 
Many, many issues hide here.  If you want to send a code pointer (and we do – that is what the static thing is all about), you can
A.      Send a code pointer (cheap).  But this means that the same code has to live at the other end, so you can talk about it by name; and that’s precisely what you don’t want.
B.      Send the code itself.  And the code that it refers to, and so on, transitively.
C.      A combination of (A) and (B).  Start with (B), but cache the code at the remote end, so you don’t send the same code twice.
 
Under a vision of (C) you’d start with all-empty nodes, except one.  When it spawns processes elsewhere, it must send the code (ALL the code, transitively).  But the remote nodes cache it.  Over time they all get all the code (well, all the code they need).  World peace breaks out.
 
But it’s complicated to implement this vision:
·         What do we mean by “same code”?  Probably a fingerprint of the transitive closure of the code.  Really the same all the way to the leaves.
·         How do we ship code?  Haskell source? Bytecode?  Object code?  Core lambda code?
 
In the end I think this is the Right Thing. But it’ll be some work to get there.


Absolutely. I should have mentioned this in my previous post, but sending functions from node to node also implies a tight coupling, for all the reasons you mentioned. Erlang does not get around this either [1]. I work on two large distributed Erlang applications (Riak and Riak CS), and (for the most part) we never send functions around. This gives our users the opportunity to upgrade each node individually. All of this being said, there certainly are applications where the advantages you get from this tighter coupling are worth the downsides. Applications where it's OK to stop the whole cluster, reload code, and then restart come to mind. In the end, I think there are lots of useful applications that can be built with Cloud Haskell using just message passing (with no functions in the messages). I'm not suggesting any changes, just trying to make the point that there are strong use cases for wanting only some set of base types in common between nodes.


Reid

Tim Watson

unread,
Jun 21, 2013, 11:04:15 AM6/21/13
to reidd...@gmail.com, Simon Peyton-Jones, parallel...@googlegroups.com, jer...@jeremyhuffman.com
Hi Reid,

On 21 Jun 2013, at 15:32, Reid Draper wrote:

On Jun 21, 2013, at 5:43 AM, Simon Peyton-Jones <sim...@microsoft.com> wrote:

In the end I think this is the Right Thing. But it’ll be some work to get there.


Absolutely. I should have mentioned this in my previous post, but sending functions from node to node also implies a tight coupling, for all the reasons you mentioned.

Indeed.

Erlang does not get around this either [1].
I work on two large distributed Erlang applications (Riak and Riak CS), and (for the most part) we never send functions around.

Hey - nice to see another Erlang programmer around here! I'd be well interested to get some feedback on ManagedProcess and Supervisor APIs - and Service.Registry (i.e., our version of gproc) when it's finished - from someone who's used their progenitors... See the "development" and "procreg" branches of https://github.com/haskell-distributed/distributed-process-platform/.

All of this being said, there certainly are applications where the advantages you get from this tighter coupling are worth the downsides.

Sure - distributed work queues like Celery, for example, are easier to manage if you can ship "jobs" around the cluster on an ad-hoc basis.

In the end, I think there are lots of useful applications that can be built with Cloud Haskell just with message (no functions in them) passing.

+1. The combination of cloud haskell and the other concurrency features of the language makes for a very powerful (and somewhat heady) mix.

I'm not suggesting any changes, just trying to make a point that there are strong use-cases for just wanting some set of base types in common between nodes.


And I think - if I've understood all the commentary over the past months correctly - that doing so will remain safe, providing the nodes share the same definition of those types. Remember, our comparison is based on the fingerprint, and providing the Typeable and Binary instances are sound, de-serialisation and coercing to the correct type(s) should work just fine.
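A small illustration of the point above, using only the base library (`JobRequest` is a hypothetical message type, not from the thread):

```haskell
-- Illustrating the point above: exchanging messages between different
-- binaries is safe provided both share the same definition of the
-- message type; the comparison is based on the Typeable fingerprint.
import Data.Typeable (typeOf)

data JobRequest = JobRequest String Int
  deriving Show

main :: IO ()
main = do
  -- Both "sides" live in one process here, but the check mirrors what
  -- happens on receive: the sender's and receiver's type
  -- representations must match before coercing the payload.
  let sender   = typeOf (undefined :: JobRequest)
      receiver = typeOf (JobRequest "demo" 0)
  print (sender == receiver)
```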

Cheers,
Tim

Reid Draper

unread,
Jun 21, 2013, 1:08:06 PM6/21/13
to Tim Watson, Simon Peyton-Jones, parallel...@googlegroups.com, jer...@jeremyhuffman.com

On Jun 21, 2013, at 11:04 AM, Tim Watson <watson....@gmail.com> wrote:

Hey - nice to see another Erlang programmer around here! I'd be well interested to get some feedback on ManagedProcess and Supervisor APIs - and Service.Registry (i.e., our version of gproc) when it's finished - from someone who's used their progenitors... See the "development" and "procreg" branches of https://github.com/haskell-distributed/distributed-process-platform/.

Sure. I'd love to find a way to get involved.

Reid

Tim Watson

unread,
Jun 21, 2013, 1:45:28 PM6/21/13
to Reid Draper, Simon Peyton-Jones, parallel...@googlegroups.com, jer...@jeremyhuffman.com
Cool. Have you visited our issue tracker at https://cloud-haskell.atlassian.net/ yet? There are *lots* of open bugs and I'm quite happy to share! ;)

I've also been tinkering with a node-registrar ch backend that works like epmd for node (name@host) based discovery. I've not had time to really get very far with it, but I think /that/ would be an instant hit for Cloud Haskell. I was planning on writing it at the network-transport (tcp) level (due to the aforementioned restrictions) but we could hack it with just message passing and publish a dual exe/library cabal package for the time being.

Shout if you've got questions. Anything you want to pick up from Jira, just let me know and I'll set you up with the relevant developer permissions for that project.

Cheers,
Tim

Duncan Coutts

unread,
Jun 27, 2013, 11:52:07 AM6/27/13
to reidd...@gmail.com, Simon Peyton-Jones, parallel...@googlegroups.com, jer...@jeremyhuffman.com
On Fri, 2013-06-21 at 10:32 -0400, Reid Draper wrote:
> On Jun 21, 2013, at 5:43 AM, Simon Peyton-Jones
> <sim...@microsoft.com> wrote:

> > But it’s complicated to implement this vision:
> > · What do we mean by “same code”? Probably a fingerprint of
> the transitive closure of the code. Really the same all the way to
> the leaves.
> > · How do we ship code? Haskell source? Bytecode? Object
> code? Core lambda code?
> >
> > In the end I think this is the Right Thing. But it’ll be some work
> to get there.
>
>
> Absolutely. I should have mentioned this in my previous post, but
> sending functions from node to node also implies a tight coupling, for
> all the reasons you mentioned.

Right, exactly: it's all about whether we're talking about use cases
that involve loose coupling or tight coupling.

Cloud Haskell as originally envisioned is really for the tight coupling
use cases, where you're running one program but making use of all the
resources of many machines (much like how we write parallel/concurrent
code to run on multiple cores, but with a programming model that works
for the distributed memory and failure modes).

The loose coupling use cases are already served reasonably well by
existing standard protocols like http etc.

That said, it's not unreasonable to want to use this message passing
programming style with more loosely coupled systems. It's an
intermediate use case between a very loosely coupled design with an
externally defined protocol (where you could implement it using many
languages and frameworks, e.g. like http or the message queuing
protocols) and the tight coupling use case of running one program on
many nodes.

But for that use case with intermediate coupling we need to make sure
we're clear about what the requirements are, and to realise those
requirements are a bit different from the existing main use case of
single binary running on many machines.

In particular, it sounds like we're really talking about two special
cases and that we never need the hard general case that Simon is
worrying about. One special case is running the exact same binary (or at
least the "same" program) and sending both simple data messages and
closures. Another special case is running different programs that
communicate by passing messages of shared types (ie the types are the
"same" between the programs but the code is different) and never send
closures. The hard general case is different programs communicating both
data messages and closures. That's hard because of the need to send code
etc.

But it sounds to me like in practice we don't need that general case. So
we might be able to have a design where you can only send closures if
the function tables match (ie it's the "same" binary, for some
definition of sameness). On the other hand we might need some extra
mechanism to check that two endpoints are talking about the same shared
type. The typeable fingerprint may be too strong, since I think it
incorporates the package version. Or the failure case where the
programmer intends the types to match but in fact they do not may be
just too annoying & tricky to track down, since the default behaviour
would be just not receiving the messages that were sent.

So it's these kinds of issues that we would have to think about if we
want to extend the system to cover the looser coupling use cases.

Duncan

Carter Schonwald

unread,
Jun 30, 2013, 3:48:34 AM6/30/13
to dun...@well-typed.com, reidd...@gmail.com, Simon Peyton-Jones, parallel...@googlegroups.com, jer...@jeremyhuffman.com
There's also another point in the design space that seems not to have been mentioned:

for a given application that "needs to send closures", those closures likely constitute a closed DSL with a *deep* embedding, and thus a serialized AST that can be interpreted by the recipient is actually perfectly valid! In many use cases, network latency is likely a greater bottleneck than running a tiny interpreter over a DSL with a first-order AST instead of using compiled code. Such an approach also SOLVES the tight coupling of binary versions issue!
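For concreteness, here is a minimal sketch of the deep-embedding idea, using only base (the `Task` type and its constructors are hypothetical, invented for illustration; `Show`/`Read` stand in for a real wire format):

```haskell
-- Sketch of the suggestion above: instead of shipping compiled
-- closures, both sides agree on a small first-order AST (a deep
-- embedding) and the recipient runs a tiny interpreter over it.
data Task
  = Log String        -- a primitive both sides promise to support
  | Add Int Int
  | Seq Task Task     -- composition lives in the AST, not in a closure
  deriving (Show, Read)

-- The recipient's interpreter: the only code that must exist on the
-- remote node. It can be upgraded independently of the sender as long
-- as the AST (the "protocol") stays compatible.
interpret :: Task -> IO ()
interpret (Log s)   = putStrLn s
interpret (Add x y) = print (x + y)
interpret (Seq a b) = interpret a >> interpret b

main :: IO ()
main = do
  let wire = show (Seq (Log "hello from afar") (Add 2 3))
  interpret (read wire :: Task)
```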

with respect to the case where you want to send a code fragment over the network and its execution is performance sensitive, I've some ideas/thoughts on how to support that, which tie into some related ideas I have for doing runtime code generation and exposing the resulting code as normal Haskell values at runtime, without needing to patch the GHC RTS at all!

 NB: I'm still a month or two away from having the time to start doing the basic runtime code gen experiments, but my current experiments that relate to this make me pretty optimistic.

cheers
 -Carter
 


Carter Schonwald

unread,
Jun 30, 2013, 3:50:25 AM6/30/13
to dun...@well-typed.com, reidd...@gmail.com, Simon Peyton-Jones, parallel...@googlegroups.com, jer...@jeremyhuffman.com
phrased differently: the current scheme of "agreeing" on certain function pointers ahead of time (namely by way of some Template Haskell generating their names and related hackery) is really also a funny way of obliquely specifying "here's an EDSL with certain primops we all promise to have". So would it not be worth making that perspective explicit?

Tim Watson

unread,
Jul 1, 2013, 5:03:56 AM7/1/13
to carter.s...@gmail.com, dun...@well-typed.com, reidd...@gmail.com, Simon Peyton-Jones, parallel...@googlegroups.com, jer...@jeremyhuffman.com
I'm +1 on this, conceptually, at the very least as an option. Since we're due to add hs-plugins support to distributed-static in the next release, this (below) feels like another useful option to have. 

Simon Peyton-Jones

unread,
Jul 1, 2013, 11:02:12 AM7/1/13
to Carter Schonwald, dun...@well-typed.com, reidd...@gmail.com, parallel...@googlegroups.com, jer...@jeremyhuffman.com, Simon Peyton-Jones

for a given application that "needs to send closures", those closures likely constitute a closed DSL with a *Deep* embedding, and thus a serialized AST that can be interpreted by the recipient is actually perfectly valid!  Considering that in many use cases network latency is likely a greater bottleneck source than running a tiny interpreter over a DSL with a first order AST rather than using compiled code! Such an approach also SOLVES the tight coupling of binary versions issue!

 

I believe that your point is this.  Rather than send a function “foo”, enumerate all the functions you want to send in a data type

          data Fun = Foo | Bar Int | Woz Bool

and then interpret them at the far end

          interpret :: Fun -> IO ()
          interpret Foo = foo
          interpret (Bar i) = bar i
          interpret (Woz b) = woz b

This is fine when there is a fixed finite set of functions, and for many applications that may be the case.  It amounts to manual defunctionalisation; perhaps inconvenient but no more than that.

 

The time it *isn’t* the case is when you are spawning a process on a remote node.  Then you need to give the function to run remotely, and at this point you are talking to the underlying infrastructure so you can’t have a per-application data type.

 

An alternative would be to somehow specify a process to be run, automatically, on every node.  Then ‘spawn’ would be realised as sending a message to that process.  Since you get to specify the process, it can interpret a per-application data type of all the processes you want to run.    It’s arguable that this should be THE way that CH supports spawning, so as to give you maximal control.   (For example the recipient process could decide that load was too heavy on that node, and forward on to another.)   Letting the application decide policy while the infrastructure provides mechanism seems like a good plan.
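The alternative above can be modelled locally with only base; a real version would use distributed-process, and all names here (`SpawnRequest`, `spawner`) are hypothetical, with a `Chan` standing in for a node's mailbox:

```haskell
-- A local model of the alternative above: every node runs one
-- well-known "spawner" process, and remote spawn is realised as
-- sending it a message drawn from a per-application data type.
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

data SpawnRequest
  = RunWorker String  -- which application-level process to start
  | Shutdown

-- The per-node interpreter owns the policy: it could refuse a job
-- under heavy load, or forward the request on to another node.
spawner :: Chan SpawnRequest -> IO ()
spawner mailbox = do
  req <- readChan mailbox
  case req of
    RunWorker name -> do
      -- a real spawner would fork/spawn the worker process here
      putStrLn ("worker started: " ++ name)
      spawner mailbox
    Shutdown -> putStrLn "spawner done"

main :: IO ()
main = do
  mailbox <- newChan
  done    <- newEmptyMVar
  _ <- forkIO (spawner mailbox >> putMVar done ())
  writeChan mailbox (RunWorker "indexer")
  writeChan mailbox Shutdown
  takeMVar done  -- wait for the spawner to drain its mailbox
```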

 

Simon

 

 

 

 

From: Carter Schonwald [mailto:carter.s...@gmail.com]
Sent: 30 June 2013 08:49
To: dun...@well-typed.com
Cc: reidd...@gmail.com; Simon Peyton-Jones; parallel...@googlegroups.com; jer...@jeremyhuffman.com
Subject: Re: Each node running the same executable?

 

Theres also another point in the design space that seems to not be mentioned:

Carter Schonwald

unread,
Jul 3, 2013, 6:15:33 PM7/3/13
to Simon Peyton-Jones, parallel...@googlegroups.com
Simon, could you explain the latter position more?

I don't understand how that's different from having a suitable "skeleton" / baseline EDSL for the alluded-to things (and I'm not sure what those alluded-to services may or may not be that you're referring to).

Thanks!
-Carter 

Tim Watson

unread,
Jul 4, 2013, 3:33:53 AM7/4/13
to sim...@microsoft.com, Carter Schonwald, dun...@well-typed.com, reidd...@gmail.com, parallel...@googlegroups.com, jer...@jeremyhuffman.com, Simon Peyton-Jones
On 1 Jul 2013, at 16:02, Simon Peyton-Jones <sim...@microsoft.com> wrote:

I believe that your point is this.  Rather than send a function “foo”, enumerate all the functions you want to send in a data type

          data Fun = Foo | Bar Int | Woz Bool

and then interpret them at the far end

          interpret :: Fun -> IO ()
          interpret Foo = foo
          interpret (Bar i) = bar i
          interpret (Woz b) = woz b

This is fine when there is a fixed finite set of functions, and for many applications that may be the case.  It amounts to manual defunctionalisation; perhaps inconvenient but no more than that.

 


It's worth noting that this is basically what a gen server (i.e., managed process) does, though without the single-data-type restriction, and with explicit support for rpc (call) or throw-away (cast) interactions.

An alternative would be to somehow specify a process to be run, automatically, on every node.  Then ‘spawn’ would be realised as sending a message to that process.  Since you get to specify the process, it can interpret a per-application data type of all the processes you want to run.    It’s arguable that this should be THE way that CH supports spawning, so as to give you maximal control.


It's quite a heavy restriction for the general case though. And you'd have to write the interpreting loop each time, which is error prone as well as tedious. Plus consumers (wanting to spawn) wouldn't have any semantic guarantees, so unless they were written by the same person, chaos could ensue. :)

   (For example the recipient process could decide that load was too heavy on that node, and forward on to another.)   Letting the application decide policy while the infrastructure provides mechanism seems like a good plan.


I agree with that tenet, but was thinking of approaching it differently. Managed processes already take a declarative (policy-based) approach to individual server processes with regard to unexpected traffic and server-side error handling. Supervisors move responsibility for error handling and recovery further up the chain. The service API, which I'm working on now, encodes other policies such as service interdependency, QoS, addressing and so on, into a framework that takes the drudgery out of wiring supervisor hierarchies and ensuring service components start in the right order, are available at the right time and can be located easily across nodes. In order to manage services across nodes, another API (currently dubbed Service.Execution) will provide pre-packaged tools for load regulation and traffic shaping,

Tim Watson

unread,
Jul 4, 2013, 3:34:30 AM7/4/13
to Tim Watson, sim...@microsoft.com, Carter Schonwald, dun...@well-typed.com, reidd...@gmail.com, parallel...@googlegroups.com, jer...@jeremyhuffman.com


Sent from my iPhone.

Tim Watson

unread,
Jul 4, 2013, 3:40:57 AM7/4/13
to Tim Watson, sim...@microsoft.com, Carter Schonwald, dun...@well-typed.com, reidd...@gmail.com, parallel...@googlegroups.com, jer...@jeremyhuffman.com
Darned phone - sent too soon...more below

Sent from my iPhone.

On 4 Jul 2013, at 08:33, Tim Watson <watson....@gmail.com> wrote:

It's worth noting that this is basically what a gen server (i.e., managed process) does, though without the single-data-type restriction, and with explicit support for rpc (call) or throw-away (cast) interactions.

An alternative would be to somehow specify a process to be run, automatically, on every node.  Then ‘spawn’ would be realised as sending a message to that process.  Since you get to specify the process, it can interpret a per-application data type of all the processes you want to run.    It’s arguable that this should be THE way that CH supports spawning, so as to give you maximal control.


It's quite a heavy restriction for the general case though. And you'd have to write the interpreting loop each time, which is error prone as well as tedious. Plus consumers (wanting to spawn) wouldn't have any semantic guarantees, so unless they were written by the same person, chaos could ensue. :)

   (For example the recipient process could decide that load was too heavy on that node, and forward on to another.)   Letting the application decide policy while the infrastructure provides mechanism seems like a good plan.


I agree with that tenet, but was thinking of approaching it differently. Managed processes already take a declarative (policy-based) approach to individual server processes with regard to unexpected traffic and server-side error handling. Supervisors move responsibility for error handling and recovery further up the chain. The service API, which I'm working on now, encodes other policies such as service interdependency, QoS, addressing and so on, into a framework that takes the drudgery out of wiring supervisor hierarchies and ensuring service components start in the right order, are available at the right time and can be located easily across nodes. In order to manage services across nodes, another API (currently dubbed Service.Execution) will provide pre-packaged tools for load regulation and traffic shaping,

And choosing where to spawn tasks/services. Along with group services (like atomic broadcast and distributed transactions), this should give folks more than enough tooling to decide policy completely at the application level, to extend what the platform supports easily and we can avoid encoding too much information at the infrastructure level.

Now I don't know how Carter's stuff fits in there. I thought he was proposing an alternate infrastructure mechanism for shipping code, but I'll let him elaborate on that.

The only thing I'd point out about all this is that there are already a lot of moving parts in Cloud Haskell. You've got to choose a network transport backend, then choose a ch backend to deal with topology/discovery/etc., and then hopefully the service API will make it easy to wire Haskell code up to run where and when you want, with lots of goodies to control the system at minimum developer effort. If we introduce more constraints or glue code that users have to write, I think we might end up shooting ourselves in the foot.

Cheers,
Tim

Simon Peyton-Jones

unread,
Jul 4, 2013, 12:00:17 PM7/4/13
to Carter Schonwald, parallel...@googlegroups.com

I’m afraid I don’t understand the question.  Maybe someone else can help or you can re-ask?  I’m racing towards the POPL deadline at the moment.

 

S

Simon Marlow

unread,
Jul 5, 2013, 2:57:39 AM7/5/13
to sim...@microsoft.com, Carter Schonwald, parallel...@googlegroups.com
I think all that's being suggested is that rather than sending actual
object code you would send an AST and interpret it at the other end. At
its simplest the AST is a closure (function + arguments, which is what
we have now), but the AST could be much richer than this, and
application-specific, in principle. You could send a subset of HsSyn,
for example, so long as the Names are known at the far end - so you
agree on some baseline primitives, such as particular versions of the
core packages.

We do already have something of this flavour in the Static
implementation in distributed-process:
https://github.com/haskell-distributed/distributed-static/blob/master/src/Control/Distributed/Static.hs

Cheers,
Simon

On 04/07/13 17:00, Simon Peyton-Jones wrote:
> I'm afraid I don't understand the question. Maybe someone else can help
> or you can re-ask? I'm racing towards the POPL deadline at the moment.
>
> S
>
> *From:*Carter Schonwald [mailto:carter.s...@gmail.com]
> *Sent:* 03 July 2013 23:16
> *To:* Simon Peyton-Jones
> *Cc:* parallel...@googlegroups.com
> *Subject:* Re: Each node running the same executable?
>
> Simon, Could you explain the latter position more?
>
> I don't understand how thats different from having a suitable
> "skeleton" / baseline EDSL for the alluded to things (and i'm not sure
> what those alluded to services may or may not be that you're referring to).
>
> Thanks!
>
> -Carter
>
> On Mon, Jul 1, 2013 at 11:02 AM, Simon Peyton-Jones
> <sim...@microsoft.com <mailto:sim...@microsoft.com>> wrote:
>
> for a given application that "needs to send closures", those
> closures likely constitute a closed DSL with a *Deep* embedding, and
> thus a serialized AST that can be interpreted by the recipient is
> actually perfectly valid! Considering that in many use cases
> network latency is likely a greater bottleneck source than running
> a tiny interpreter over a DSL with a first order AST rather than
> using compiled code! Such an approach also SOLVES the tight coupling
> of binary versions issue!
>
> I believe that your point is this. Rather than send a function
> “foo”, enumerate all the functions you want to send in a data type
>
> data Fun = Foo | Bar Int | Woz Bool
>
> and then interpret them at the far end
>
> interpret :: Fun -> IO ()
>
> interpret Foo = foo
>
> interpret (Bar i) = bar i
>
> interpret (Woz b) = woz b
>
> This is fine when there is a fixed finite set of functions, and for
> many applications that may be the case. It amounts to manual
> defunctionalisation; perhaps inconvenient but no more than that.
>
> The time it **isn’t** the case is when you are spawning a process on
> a remote node. Then you need to give the function to run remotely,
> and at this point you are talking to the underlying infrastructure
> so you can’t have a per-application data type.
>
> An alternative would be to somehow specify a process to be run,
> automatically, on every node. Then ‘spawn’ would be realised as
> sending a message to that process. Since you get to specify the
> process, it can interpret a per-application data type of all the
> processes you want to run. It’s arguable that this should be THE
> way that CH supports spawning, so as to give you maximal control.
> (For example the recipient process could decide that load was too
> heavy on that node, and forward on to another.) Letting the
> application decide policy while the infrastructure provides
> mechanism seems like a good plan.
>
> Simon
>
> *From:*Carter Schonwald [mailto:carter.s...@gmail.com
> <mailto:carter.s...@gmail.com>]
> *Sent:* 30 June 2013 08:49
> *To:* dun...@well-typed.com <mailto:dun...@well-typed.com>
> *Cc:* reidd...@gmail.com <mailto:reidd...@gmail.com>; Simon
> Peyton-Jones; parallel...@googlegroups.com
> <mailto:parallel...@googlegroups.com>; jer...@jeremyhuffman.com
> <mailto:jer...@jeremyhuffman.com>
>
>
> *Subject:* Re: Each node running the same executable?
>
> There's also another point in the design space that seems not to be
> mentioned:
>
> for a given application that "needs to send closures", those
> closures likely constitute a closed DSL with a *Deep* embedding, and
> thus a serialized AST that can be interpreted by the recipient is
> actually perfectly valid! Considering that in many use cases
> network latency is likely a greater bottleneck source than running
> a tiny interpreter over a DSL with a first order AST rather than
> using compiled code! Such an approach also SOLVES the tight coupling
> of binary versions issue!
>
> with respect to the matter of when you want to send a code fragment
> over the network and its execution is performance sensitive, I've
> some ideas / thoughts on how to support that which tie into some
> related ideas I have for doing runtime code generation and exposing
> the resulting code as normal haskell values at runtime, without
> needing to patch the GHC RTS at all!
>
> NB: I'm still a month or two away from having the time to start
> doing the basic runtime code gen experiments, but my current
> experiments that relate to this make me pretty optimistic.
>
> cheers
>
> -Carter
>
> On Thu, Jun 27, 2013 at 11:52 AM, Duncan Coutts
> <duncan...@googlemail.com <mailto:duncan...@googlemail.com>>
> wrote:
>
> On Fri, 2013-06-21 at 10:32 -0400, Reid Draper wrote:
> > On Jun 21, 2013, at 5:43 AM, Simon Peyton-Jones
> > <sim...@microsoft.com <mailto:sim...@microsoft.com>> wrote:
>
> > > But it’s complicated to implement this vision:
> > > · What do we mean by “same code”? Probably a
> fingerprint of
> > the transitive closure of the code. Really the same all the way to
> > the leaves.
> > > · How do we ship code? Haskell source? Bytecode? Object
> > code? Core lambda code?
> > >
> > > In the end I think this is the Right Thing. But it’ll be some work
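
Simon's enumerate-and-interpret (manual defunctionalisation) pattern quoted above can be made concrete as a small runnable sketch. This is a hedged illustration, not the distributed-process API: the `Fun` type, its `Binary` instance, and the `Int32` field (chosen per the fixed-size-types advice earlier in the thread) are all assumptions for the sake of example.

```haskell
{-# LANGUAGE DeriveGeneric #-}
-- Hedged sketch: enumerate the functions you want to run remotely as a
-- first-order data type, serialise it, and interpret it on the far end.
-- Uses the binary package (a GHC boot library) for the wire format.

import Data.Binary (Binary, decode, encode)
import Data.Int (Int32)
import GHC.Generics (Generic)

data Fun = Foo | Bar Int32 | Woz Bool
  deriving (Show, Eq, Generic)

instance Binary Fun  -- wire format derived generically

-- The receiving node runs this instead of any shipped object code.
interpret :: Fun -> IO ()
interpret Foo     = putStrLn "foo"
interpret (Bar i) = putStrLn ("bar " ++ show i)
interpret (Woz b) = putStrLn ("woz " ++ show b)

main :: IO ()
main = do
  let wire = encode (Bar 42)        -- "send" side
  interpret (decode wire :: Fun)    -- "receive" side; prints "bar 42"
```

In the alternative Simon describes, a per-application process on each node would receive `Fun` values as messages and run `interpret` on them, so spawn-like behaviour reduces to sending a message.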

Carter Tazio Schonwald

unread,
Jul 9, 2013, 3:55:21 PM
to Simon Marlow, Simon Peyton Jones, parallel...@googlegroups.com
Simon M is correctly translating what I'm proposing / suggesting. (Thanks!)

Some *significant* reasons why sending an AST rather than object code over the network is better include the following:
1) it supports heterogeneous computation (or at least a heterogeneous cluster);
2) it gives the nodes room for a "trust but verify" rather than a "trust" model for code sent over the wire, which could also help with catching bugs and such :)
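
Both points can be illustrated with a tiny deep embedding. The names below (`Expr`, `checkedEval`, the size bound) are hypothetical, invented purely for this sketch: both ends need only agree on the `Expr` type, not on identical binaries, and the recipient can inspect the AST before evaluating it.

```haskell
{-# LANGUAGE DeriveGeneric #-}
-- Hedged sketch of shipping a deeply embedded, first-order AST instead
-- of object code, then interpreting it on the receiving node.

import Data.Binary (Binary, decode, encode)
import Data.Int (Int32)
import GHC.Generics (Generic)

-- A closed DSL: the shared "baseline" both nodes agree on.
data Expr
  = Lit Int32
  | Add Expr Expr
  | Mul Expr Expr
  deriving (Show, Eq, Generic)

instance Binary Expr

-- A toy "trust but verify" step: inspect the AST (here, reject
-- over-large programs) before evaluating it.
checkedEval :: Int -> Expr -> Maybe Int32
checkedEval maxSize e
  | size e > maxSize = Nothing
  | otherwise        = Just (eval e)
  where
    size (Lit _)   = 1 :: Int
    size (Add a b) = 1 + size a + size b
    size (Mul a b) = 1 + size a + size b
    eval (Lit n)   = n
    eval (Add a b) = eval a + eval b
    eval (Mul a b) = eval a * eval b

main :: IO ()
main = do
  let wire = encode (Add (Lit 2) (Mul (Lit 3) (Lit 4)))  -- sender
      expr = decode wire :: Expr                          -- receiver
  print (checkedEval 100 expr)  -- prints: Just 14
```

Because the wire format is just the `Expr` encoding, a 32-bit and a 64-bit node (or differently built binaries) can interoperate so long as they share this type.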

