I can't speak for Jeff here, but I believe his philosophy with
CloudHaskell was not to have it handle job scheduling (discovering a
new node, getting code to it, running the binary image).
It looks like CloudHaskell currently supports discovery of *nodes*
(cfgPeerDiscoveryPort) but requires a static list of *hosts*
(cfgKnownHosts). It would definitely be nice to support dynamic
extension of that list.
As for tracking data blobs that have migrated... well, that definitely
seems like a concern for a higher-level programming layer to deal
with, rather than for CloudHaskell.
-Ryan
On Wed, Oct 12, 2011 at 1:48 PM, Alberto G. Corona <agoc...@gmail.com> wrote:
> I need to use Haskell in an environment where different users dynamically
> add and remove nodes to/from the cloud.
> A question that seems not to be explicitly considered (there are too many
> three-letter acronyms here) is dynamic configuration, although I saw
> something in this direction in the description of CIEL.
> What do you do to add a new node to the cloud? How do you notice it and
> make use of it? How do you distribute workload taking into account not
> only static but dynamic conditions? For example, a task is doing something
> with a blob of data on one machine, but later it has to do it with a
> remote blob. Is it better to migrate the process/state? Is it better to
> spawn a subprocess? Is it better to move the blob to the local machine
> instead? All this depends not only on static but on dynamic conditions:
> local CPU load, length of the blob, available bandwidth. Ideally, all
> these details should not be specified in a high-level, declarative,
> task-oriented language running over the services of Cloud Haskell. The
> scheduler need not be hyper-intelligent; it can just execute rules
> defined by the programmer.
>
>
Ryan, you're partially right. Nodes will be detected dynamically on
all available hosts. That is, you can, during execution, start new CH
instances on existing hosts, and the getPeers function will pick up
the new instances. Hosts can be either enumerated statically in the
config file (via cfgKnownHosts) or discovered dynamically (via
cfgPeerDiscoveryPort). Dynamic detection is based on UDP broadcast,
so it usually only works over local networks, which may or may not be
appropriate for your application and deployment. One could plug a new
host into the network and CH would immediately start using it.
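For concreteness, a config file combining both mechanisms might look
something like the fragment below. The key names cfgKnownHosts and
cfgPeerDiscoveryPort come from this thread, but the surrounding syntax
and the port value are illustrative; check the package documentation
for the precise format.

```text
cfgKnownHosts host1 host2
cfgPeerDiscoveryPort 38813
```

With both set, statically listed hosts are always reachable, while any
additional hosts on the local network answering on the discovery port
are picked up as they appear.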
Alberto, the task layer has already been implemented. It does do job
scheduling, although currently in a naive round-robin way. I agree
that the scheduler should take into account CPU load, available
bandwidth, and other considerations, but I'm not sure of the best way
to express these constraints. The CH scheduler is pretty modular (it's
all in Remote.Task.selectLocation), so it could easily be replaced
with a well-known distributed scheduler, such as Quincy. Currently,
running tasks can't be migrated, so the framework needs to move blobs
to processes, rather than the other way around. The programmer may
provide "hints" to the scheduler if it is known in advance which blobs
will be needed by which process. The task layer also automatically
manages fault recovery and caches intermediate results, so you also
get checkpointing.
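To illustrate what the current naive policy amounts to (this is not
the actual Remote.Task.selectLocation code; the function and types
here are invented for the sketch), round-robin placement is
essentially cycling the node list against the task list:

```haskell
-- Sketch of naive round-robin task placement, for illustration only.
-- Assumes a non-empty node list; the real scheduler in
-- Remote.Task.selectLocation uses different types and also consults
-- placement hints.
roundRobin :: [node] -> [task] -> [(task, node)]
roundRobin nodes tasks = zip tasks (cycle nodes)

main :: IO ()
main = print (roundRobin ["hostA", "hostB"] ["t1", "t2", "t3"])
```

A smarter policy would replace the bare `cycle nodes` with a choice
driven by load, bandwidth, or the programmer-supplied hints mentioned
above, which is exactly the kind of rule-based scheduling Alberto asks
for.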
See section 2.4 of my thesis for more information on the task layer.
Peer discovery is covered in section 4.2.4.
http://www.cl.cam.ac.uk/~jee36/thesis.pdf
Jeff