Publish Subscribe in a cluster

Saurabh Rawat

Aug 17, 2016, 1:09:00 AM
to Distributed Haskell
Hi,


I am new to Haskell. In the Scala world I use Akka to handle publish-subscribe in a cluster.

What I am trying to achieve is to have processes in a cluster of Haskell nodes call each other through logical names, without worrying about the physical location of those processes.
I can't seem to find out whether this is possible with Cloud Haskell.

Is there also a mechanism to monitor processes and spawn new ones on nodes if a particular named entity doesn't exist anywhere in the cluster?

Please tell me how much of this is possible and whether it's already a solved problem through some library. I so want to use Haskell, but this is a major hurdle.


Regards,

Saurabh Rawat

Simon Peyton Jones

Aug 17, 2016, 9:43:30 AM
to Saurabh Rawat, Distributed Haskell

Sounds like you need Cloud Haskell:  http://haskell-distributed.github.io/

That’s Haskell’s version of Akka.

 

Simon

 

--
You received this message because you are subscribed to the Google Groups "Distributed Haskell" group.
To unsubscribe from this group and stop receiving emails from it, send an email to distributed-has...@googlegroups.com.
To post to this group, send email to distribut...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/distributed-haskell/717230a1-6dfb-4f67-bca0-56dae35e578c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Saurabh Rawat

Aug 17, 2016, 10:40:54 AM
to Simon Peyton Jones, Distributed Haskell
I can't believe I am getting a reply from Sir Simon Peyton Jones :P

I know about Cloud Haskell. Let me make a comparison though. In Akka, I can have named entities distributed over a cluster of nodes. To communicate with them, their name is all I need.
Cloud Haskell does provide an actor model, but as far as I can tell there is no such functionality (cluster management, named lookups, starting missing entities on available nodes, etc.) on top of it.
I see this as a bare minimum to be able to write a horizontally scalable web app. It could well be that I only know how to do it this way :P

I posted here because I am fairly new to Haskell and I am desperately looking for something which will help me achieve this. I like Haskell so much (I even wrote a Cassandra driver so I won't have to give it up).
I am so sad now that after having done all that I am still away from being able to use it. It would not be pragmatic to give up Akka and not have the bare minimum for Haskell.


Regards,

Saurabh Rawat

Simon Peyton Jones

Aug 17, 2016, 11:19:05 AM
to Saurabh Rawat, Distributed Haskell, Tim Watson, Boespflug, Mathieu, parallel...@googlegroups.com

OK now I’m out of my depth :-). 

 

I bet that Tim Watson and/or Mathieu Boespflug would know.  They are probably just on holiday or something.

 

Simon

 

Boespflug, Mathieu

Aug 17, 2016, 11:40:22 AM
to Simon Peyton Jones, Saurabh Rawat, Distributed Haskell, Tim Watson, parallel...@googlegroups.com
Hi Saurabh,

>  I am so sad now that after having done all that I am still away from being able to use it.

Don't be so sad! What you are looking for is a so-called "registry", something equivalent to Erlang's epmd service. Such a registry keeps track of what nodes are in the cluster, lets you connect to them, and assists in node discovery. But note that you can also do without it for simple deployments. Cloud Haskell has a modular architecture where you can pick and choose what you need. You can use distributed-process-simplelocalnet to have nodes auto-discover each other even without a central registry, when on a single subnet. See this tutorial for an example of how to use it: 


There are more advanced solutions out there for more advanced use cases, but they are either too specific to particular deployment scenarios or still proprietary (or both).

At its core Cloud Haskell just gives you the basic programming model. It's up to you to pick and choose the deployment add-ons that fit well with the apps you wrote and where you want to deploy them (a local bare metal cluster? AWS? Azure?...). Your apps should care very much about the programming model, but be largely oblivious to deployment details. If nothing fits your particular use case well yet, the good news is it's really easy to contribute an add-on package that will do exactly what you want. You don't even need to modify distributed-process, the core Cloud Haskell package!

Even better if you can publish your package publicly. Others will be glad to reuse it for their deployments.

Best,

--
Mathieu Boespflug
Founder at http://tweag.io.

Saurabh Rawat

Aug 17, 2016, 1:55:33 PM
to Boespflug, Mathieu, Simon Peyton Jones, Distributed Haskell, Tim Watson, parallel...@googlegroups.com
Hi Mathieu,

Thanks for that info :)

If I understand it correctly, simplelocalnet gives me a way to find all my peers and send a message to a named process, which may or may not exist on every node.
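
For reference, that pattern might look roughly like the sketch below. It assumes the distributed-process and distributed-process-simplelocalnet packages; the registered name "echo", the greeting text, and the one-second discovery timeout are illustrative choices, not anything the libraries prescribe.

```haskell
module Main where

import Control.Distributed.Process
import Control.Distributed.Process.Backend.SimpleLocalnet
  (Backend (..), initializeBackend)
import Control.Distributed.Process.Node (initRemoteTable, runProcess)
import Control.Monad (forM_, forever)
import Control.Monad.IO.Class (liftIO)
import System.Environment (getArgs)

-- Start one copy per node, passing the host and port to listen on,
-- e.g. `./peers 127.0.0.1 8081` and `./peers 127.0.0.1 8082`.
main :: IO ()
main = do
  [host, port] <- getArgs
  backend <- initializeBackend host port initRemoteTable
  node <- newLocalNode backend
  runProcess node $ do
    -- Register this process under a logical name on its own node.
    -- "echo" is a made-up name for this sketch.
    self <- getSelfPid
    register "echo" self
    -- Discover peers on the local subnet (timeout in microseconds).
    peers <- liftIO (findPeers backend 1000000)
    -- Address the named process on each peer without knowing its ProcessId.
    -- Nodes that have nothing registered under the name drop the message.
    forM_ peers $ \peer ->
      nsendRemote peer "echo" ("hello from " ++ show self)
    -- Log whatever greetings arrive from other nodes.
    forever $ do
      msg <- expect :: Process String
      say msg
```

Note that the name is resolved per node: the sender still has to say which node to ask, so this is not yet a cluster-wide name lookup.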

To get something like Akka's cluster-sharded actors, I would have to build a layer on top of this that takes care of spawning named processes and keeps a registry of their locations to route messages?
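
A very rough sketch of what the smallest version of such a layer could look like, assuming distributed-process and distributed-process-simplelocalnet. The helper names `locate`, `ensureNamed` and `keepAlive`, the name "counter", the address, and the timeouts are all hypothetical; only the library calls (`whereis`, `whereisRemoteAsync`, `register`, `monitor`, `findPeers` and so on) come from the packages.

```haskell
module Main where

import Control.Distributed.Process
import Control.Distributed.Process.Backend.SimpleLocalnet
  (Backend (..), initializeBackend)
import Control.Distributed.Process.Node (initRemoteTable, runProcess)
import Control.Monad (forever)
import Control.Monad.IO.Class (liftIO)

-- Ask each node in turn whether `name` is registered there.
locate :: [NodeId] -> String -> Process (Maybe ProcessId)
locate [] _ = return Nothing
locate (nid : rest) name = do
  whereisRemoteAsync nid name
  reply <- receiveTimeout 1000000
    [ matchIf (\(WhereIsReply n _) -> n == name)
              (\(WhereIsReply _ found) -> return found)
    ]
  case reply of
    Just (Just pid) -> return (Just pid)
    _               -> locate rest name

-- Return the process behind `name`, starting it locally if no node has it.
ensureNamed :: [NodeId] -> String -> Process () -> Process ProcessId
ensureNamed peers name body = do
  local <- whereis name
  found <- maybe (locate peers name) (return . Just) local
  case found of
    Just pid -> return pid
    Nothing  -> do
      pid <- spawnLocal body
      register name pid
      return pid

-- Keep `name` alive: monitor the current holder and re-create it on exit.
keepAlive :: Backend -> String -> Process () -> Process ()
keepAlive backend name body = forever $ do
  peers <- liftIO (findPeers backend 1000000)
  pid   <- ensureNamed peers name body
  ref   <- monitor pid
  receiveWait
    [ matchIf (\(ProcessMonitorNotification r _ _) -> r == ref)
              (\_ -> say (name ++ " exited, restarting"))
    ]

main :: IO ()
main = do
  -- Host and port are placeholders for this sketch.
  backend <- initializeBackend "127.0.0.1" "8081" initRemoteTable
  node <- newLocalNode backend
  runProcess node $
    keepAlive backend "counter" $
      forever (expect >>= \n -> say (show (n :: Int)))
```

This leaves out the hard parts: two nodes can both decide the name is missing and start duplicates, and nothing here handles network partitions or moves work between nodes. Those are the edge cases that a real sharding layer has to deal with.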

That sounds like a big project. I don't think I am qualified enough for that :P
I wish someone in the Haskell community had faced the same problem and come up with a solution. It seems such a common functionality to me!

Well maybe not for immediate projects then. The thought of going back is painful. I hope I can write this layer soon.

Regards,

Saurabh Rawat

Saurabh Rawat

Aug 18, 2016, 12:14:18 AM
to Duncan Coutts, Distributed Haskell, Simon Peyton Jones, parallel...@googlegroups.com
Hi Duncan,

> The way cloud haskell is structured is that there is a core library
> distributed-process which provides the basics and then there are other
> libraries distributed-process-* which provide additional functionality
> on top.
>
> What is the "bare minimum" for you is something I've never needed, or
> at least not in the same way, and I wouldn't be surprised if what I
> consider the essential abstractions are not the same ones you need.


Understood. What I meant was "I wish people more intelligent than me had faced this problem, so that I wouldn't have to" :P
 
> So have a look at the other distributed-process-* libraries and see if
> they fit your needs at all, and if not don't be afraid to just
> implement something. It's probably not that difficult.
>
> For example in our application we have a simple "cluster management"
> system which is just a module or two of code. It allows machines to
> join and for other members to find out all current members or to be
> notified of new ones joining.
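
As a rough illustration only (this is not the module described above, whose code does not appear in this thread), the bookkeeping for such a system can be modelled as a pure transition function using nothing but base. The `Member`, `Event`, `Notice` and `step` names are made up for the sketch, and `Member` is a plain string standing in for a NodeId or ProcessId.

```haskell
module Main where

import Data.List (delete)

-- Stand-in for a NodeId or ProcessId in a real deployment.
type Member = String

data Event = Join Member | Leave Member

-- One notice per existing member that must hear about a newcomer.
data Notice = Joined { recipient :: Member, newcomer :: Member }
  deriving (Eq, Show)

-- Pure transition: the new member list plus the notices to send out.
step :: [Member] -> Event -> ([Member], [Notice])
step members (Join m)
  | m `elem` members = (members, [])  -- duplicate joins are ignored
  | otherwise        = (members ++ [m], [Joined old m | old <- members])
step members (Leave m) = (delete m members, [])

main :: IO ()
main = do
  let (ms1, n1) = step [] (Join "a")
      (ms2, n2) = step ms1 (Join "b")
      (ms3, _)  = step ms2 (Leave "a")
  print (ms2, n1 ++ n2)
  print ms3
```

A Cloud Haskell process would wrap `step` in a receive loop, applying it to each incoming message and sending out the resulting notices. Keeping the transition pure means the edge cases can be tested without a network.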


I was under the impression that one should not roll their own distributed anything. At least with Akka, I know it took quite some time to get all the edge cases sorted out. But I understand that someone has to write it first.

Thanks for your time everyone. I will try to do it and pester you with more questions :)

Regards,

Saurabh Rawat