Status of Azure support in Elasticluster


Nicolas D

Jan 9, 2018, 8:25:49 AM
to elasticluster

Hi all,


I am testing Elasticluster to create and manage a Slurm cluster on Microsoft Azure.

So far, I have not succeeded in creating any instances (still trying!). I basically started from this example [http://c-square.github.io/2015-11-19-create-a-cluster-on-windows-azure/#ii-setup-elasticluster] but it keeps failing with different errors in the script azure_provider.py (I am using dev version 1.3.dev0).


However, looking at the available documentation and feedback regarding Azure support in elasticluster, there are not many resources available (or they are quite old).

What is the status of Azure support in elasticluster? Has anyone recently managed to run elasticluster with Azure?


Regards,

Nicolas

Riccardo Murri

Jan 10, 2018, 6:28:05 AM
to Nicolas D, elasticluster
Dear Nicolas,

> What is the status of Azure support in elasticluster? Has anyone recently
> managed to run elasticluster with Azure?

As far as I can recall, this is the first time someone has posted about
attempts at getting clusters running on Azure (apart from the author of
the Azure backend code, years ago).

Indeed, I tried following the instructions at
the link you posted and they didn't work:

* the procedure for uploading management certificates has changed; the
new procedure is here: https://docs.microsoft.com/en-us/azure/azure-api-management-certs

* the Azure native provider in ElastiCluster has a few code problems
(`wait_timeout` being set to the wrong default, hard-coded names
"frontend" and "compute" for nodes, etc.)

* I didn't have much luck in using the LibCloud-based driver either: the
code seems to rely on a `list_key_pairs()` function which is not
implemented in the Azure driver...

These may all be relatively easy to fix, but I won't have time for this
in the next few days.

I am afraid I must say that Azure is currently not working in
ElastiCluster and that this won't be fixed soon, unless somebody has
time to contribute code or debugging.

Ciao,
R

--
Riccardo Murri / Email: riccard...@gmail.com / Tel.: +41 77 458 98 32

Nicolas Deladerriere

Jan 10, 2018, 8:26:09 AM
to Riccardo Murri, elasticluster

Dear Riccardo,


Thanks for your feedback.

I ran into basically the same issues and had started debugging azure_provider.py before sending my message.

My initial tests were meant to evaluate Elasticluster with Azure, but unfortunately I will not have time for further debugging.

However, I will keep an eye on any elasticluster updates.

 

Thanks for your help,

Regards,

Nicolas

DVD PS

Feb 12, 2018, 10:59:07 AM
to elasticluster
Hi Riccardo, Nicolas,

I found this mailing list after running into everything that Riccardo mentioned here. I'm trying to use elasticluster to run some clusters on Azure for a research project, but no luck so far. Anyway, it doesn't feel that bad to know I'm not alone :)

If I manage to get somewhere I will let you know.

Cheers,
David

Dave Steinkraus

Mar 15, 2018, 12:00:10 PM
to elasticluster
Hi all,

Weighing in as the original author of the Azure provider for Elasticluster. At the time I wrote it (2015), the Azure APIs I was writing to were already on their way out, to be replaced by a whole new set, which are "declarative" - 'Syntax that lets you state "Here is what I intend to create" without having to write the sequence of programming commands to create it. The Resource Manager template is an example of declarative syntax. In the file, you define the properties for the infrastructure to deploy to Azure.' [https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-group-overview] Back then, there was no firm date for these new APIs to be released.

I'm sorry but not surprised that the old APIs have stopped working correctly. I haven't kept up with Azure, but I'm sure at this point the only way forward is to rewrite the provider using the current APIs. Unfortunately, very little of the old code is likely to be reusable. (On the upside, I'll bet the new APIs work much better and more simply than the old ones, which were never really stable.)

I have a lot of commitments right now, but I'm willing to put some time into scoping out a rewrite. That's all I can promise.

Regards,
Dave 

nicolas.deladerriere

Mar 15, 2018, 12:57:58 PM
to Dave Steinkraus, elasticluster
Hi Dave,

Thanks for this update. For my project, I have moved to the Azure Batch service, which is quite easy to use and really fits my needs. However, I am still keeping an eye out for a cloud-provider-independent solution.

Regards,
Nicolas

--
You received this message because you are subscribed to a topic in the Google Groups "elasticluster" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/elasticluster/yp_YMJPF3u8/unsubscribe.
To unsubscribe from this group and all its topics, send an email to elasticluste...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Dave Steinkraus

Mar 15, 2018, 3:10:32 PM
to nicolas.deladerriere, elasticluster
Hi Nicolas, glad you found a good solution for your needs. - Dave

Riccardo Murri

Mar 15, 2018, 4:33:54 PM
to Dave Steinkraus, elasticluster
Hello Dave,

glad to see you're still around, and thanks for chiming in!

I also came to the conclusion that the new APIs are much easier to handle
and it would be worth rewriting the support module using them.

Microsoft docs link to this example code, which (it looks to me)
covers all the basic functionality needed by ElastiCluster:
https://github.com/Azure-Samples/virtual-machines-python-manage/blob/master/example.py

I have started a rewrite using that code as an example, but I have
trouble navigating the API docs; perhaps you know more and can give me
some hints or pointers to the relevant documentation? In particular:

- the above example allocates a private IP address on a private network (VNet);
however ElastiCluster VMs would instead need to have a public IP.
This page seems to have all the required info, but only has procedures for
attaching IPs via the web interface or the CLI -- no API access.
https://docs.microsoft.com/en-us/azure/virtual-network/virtual-network-network-interface-addresses

- How does one specify SSH public keys to inject in the VM? This part
is missing from the example.
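
For readers with the same two questions, here is a minimal sketch of what the request payloads might look like with the Azure Resource Manager Python SDK. It only builds plain dictionaries, so it runs without Azure credentials. The snake_case key names are assumed to mirror the model attributes of the `azure-mgmt-network` and `azure-mgmt-compute` packages and should be checked against the SDK version in use; the helper names, resource IDs, and user name are hypothetical and not taken from ElastiCluster.

```python
"""Sketch: payloads for a NIC with a public IP and an OS profile with an SSH key.

Nothing here talks to Azure. The dicts are meant to be handed to the SDK's
create-or-update calls, which is an assumption to verify for your SDK version.
"""


def public_ip_params(location):
    """Parameters for a dynamically allocated public IP address."""
    return {
        "location": location,
        "public_ip_allocation_method": "Dynamic",
    }


def nic_params(location, subnet_id, public_ip_id):
    """NIC with one IP configuration on `subnet_id`, bound to a public IP."""
    return {
        "location": location,
        "ip_configurations": [
            {
                "name": "primary",  # hypothetical configuration name
                "subnet": {"id": subnet_id},
                "public_ip_address": {"id": public_ip_id},
            }
        ],
    }


def os_profile_params(computer_name, admin_username, ssh_public_key):
    """OS profile that disables passwords and injects one SSH public key."""
    return {
        "computer_name": computer_name,
        "admin_username": admin_username,
        "linux_configuration": {
            "disable_password_authentication": True,
            "ssh": {
                "public_keys": [
                    {
                        # The key is written to the admin user's authorized_keys.
                        "path": "/home/%s/.ssh/authorized_keys" % admin_username,
                        "key_data": ssh_public_key,
                    }
                ]
            },
        },
    }
```

The idea is that the public IP is created first, its resource ID is referenced from the NIC's IP configuration, and the NIC and OS profile are then referenced from the VM definition.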

Thanks for any help!

Riccardo

Dave Steinkraus

Mar 15, 2018, 5:58:42 PM
to Riccardo Murri, elasticluster
Hi Riccardo. Seems like elasticluster is thriving under your leadership (based on the github notifications, which I still get).

As I said, I haven't kept up with Azure (been more involved with AWS) so I'm coming in knowing less than you about the new APIs.

I can look into your questions - a quick google didn't yield much. 

Best,
Dave

Riccardo Murri

Mar 17, 2018, 7:15:47 PM
to elasticluster
Hi all,

I have finished drafting code for the new Azure backend. It is
available in PR https://github.com/gc3-uzh-ch/elasticluster/pull/530
(note: PRs are not available in the Docker image, you need to run from
the Python sources).

The PR web page has a few details on the current status of the code,
what works and what does not (in short: starting works, but stopping
does not delete all resources -- be sure to check the Azure portal so
you do not leave billable resources running!)

Also note: the configuration for Azure has changed!

An example configuration file with the new keys/format is available
at: https://github.com/riccardomurri/elasticluster/blob/42ad511df66aa8d14bad267c649cf41ba8aa53b6/examples/slurm-and-gridengine-on-azure-complete.conf

I'll be grateful for any report of success or errors!

Ciao,
R

Hatef Monajemi

Mar 17, 2018, 7:34:27 PM
to Riccardo Murri, elasticluster
Great to hear this. I will test it now and let you know how it goes.

Best,
hatef

Riccardo Murri

Mar 20, 2018, 6:18:44 PM
to elasticluster
Dear all,

I have just merged the new Azure provider into the "master" branch;
this means the Docker image will get working Azure support in a few
minutes.

The configuration information has changed; please see
http://elasticluster.readthedocs.io/en/latest/configure.html#valid-configuration-keys-for-azure
to know what needs to be there.
An example config is provided at:
https://github.com/gc3-uzh-ch/elasticluster/blob/master/examples/slurm-and-gridengine-on-azure-complete.conf

At least a couple of features are unsupported:

* security groups: Azure only allows 1 security group per VM; still,
ElastiCluster currently ignores this parameter and creates a new
security group, named like the cluster, initially containing rules to
allow inbound SSH. Any other rule must be added afterwards using the
Azure portal or any other management interface.

* ``image_userdata`` is currently *not* supported on Azure.
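
As an illustration of adding a rule through "any other management interface", here is a hedged sketch of the parameters for one extra inbound rule as they might be passed to the Azure Python SDK. It only builds a dictionary and does not contact Azure. The key names are assumed to follow the security rule model of the `azure-mgmt-network` package, and the function name is hypothetical; please verify both against the SDK documentation before relying on them.

```python
def inbound_tcp_rule(port, priority, source="*"):
    """Build parameters for an 'allow inbound TCP' security rule.

    Azure accepts rule priorities from 100 to 4096; lower numbers are
    evaluated first, and each rule in a security group needs its own value.
    """
    if not 100 <= priority <= 4096:
        raise ValueError("priority must be between 100 and 4096")
    if not 0 < port < 65536:
        raise ValueError("port must be between 1 and 65535")
    return {
        "protocol": "Tcp",
        "direction": "Inbound",
        "access": "Allow",
        "priority": priority,
        "source_address_prefix": source,
        "source_port_range": "*",
        "destination_address_prefix": "*",
        "destination_port_range": str(port),
    }


# Example: open HTTPS in addition to the SSH rule created with the cluster.
https_rule = inbound_tcp_rule(443, priority=200)
```

The rule would be attached to the security group named after the cluster; since the group is created by ElastiCluster, rules added this way live outside the cluster configuration file.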

Please let me know if you run into problems with this new code.

Ciao,
R

Pablo Escobar

Mar 20, 2018, 6:48:29 PM
to Riccardo Murri, elasticluster
Hi Riccardo,

Not sure if this suggestion makes much sense, as I am not an expert in boto, Terraform, or elasticluster, but lately I have tested Terraform and it seems a really useful and powerful tool to boot cloud environments. After some quick tests I was really impressed by how easy to use and powerful it is.

Terraform has providers for AWS/GCE/OpenStack/Azure (and many others)

In a quick Google search I have also found some Python wrappers (I haven't tested them though)

Maybe in the future you will want to evaluate the option of delegating the communication with cloud APIs to Terraform and relying on Ansible only for configuration. Without being an expert in the topic, IMHO it seems more powerful than boto

just an idea ;)

regards,
Pablo.

p.s. I haven't forgotten about our project, it's just that I am crazy overloaded with other projects which should be "finished yesterday"






nicolas.deladerriere

Mar 20, 2018, 7:05:58 PM
to Riccardo Murri, elasticluster
Riccardo,

Thanks a lot for this update.
I don't know when, but I will hopefully give it a try.
Just one question (sorry, I did not really have a look before asking): is it possible to allocate VMs within an already defined VNet without defining a public IP for the created VMs?

Regards,
Nicolas


Riccardo Murri

Mar 21, 2018, 4:02:29 PM
to nicolas.deladerriere, elasticluster
Hello Nicolas,

> Just one question (sorry, I did not really have a look before asking): is it
> possible to allocate VMs within an already defined VNet without defining
> a public IP for the created VMs?

No, not with the current code. Adding support for using an existing
VNet should be relatively straightforward, but Ansible needs a way to
connect to the target VM via SSH -- if there's no public IP, how can
it connect?
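
To make the constraint concrete: a VM whose only addresses are in private ranges can be reached over SSH only from inside the VNet (or through a VPN or jump host). Below is a minimal sketch using Python's standard `ipaddress` module; the function name is hypothetical and this check is not part of ElastiCluster.

```python
import ipaddress


def has_public_address(addresses):
    """Return True if at least one address is outside the private ranges.

    `addresses` is an iterable of IP address strings, e.g. the addresses
    reported for a VM's network interfaces.
    """
    return any(not ipaddress.ip_address(a).is_private for a in addresses)


# A VM with only VNet-internal addresses is not reachable from outside.
vnet_only = ["10.0.0.4", "192.168.1.5"]
with_public = ["10.0.0.4", "8.8.8.8"]
```

With `vnet_only`, the function returns False, which is the situation where Ansible running outside the VNet has no address to connect to.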

Kind regards,
Riccardo