Setting up AiiDA for multiple users in cloud services

Jonathan Chico

Jul 18, 2022, 2:41:31 AM
to aiidausers
Dear all,

I was wondering about deploying AiiDA for multiple users in a cloud service (Azure). Currently I have about 4 users that make use of AiiDA to perform DFT and MD calculations.

What I have done so far is create an Ubuntu VM where each user has a conda installation with an AiiDA virtual environment. Inside each virtual environment, the AiiDA packages that my users need are installed. The databases for each profile are handled via a global PostgreSQL installation inside the VM.

While this works quite well, I was wondering what options there are to improve the user experience and data safety. I have to update the VM that currently hosts our system, so it seems like a good time to experiment. Besides provisioning AiiDA, my cluster needs to deploy several other services, such as license servers.

From what I have seen in the repos, there are several options that could be used:
  1. Using the ansible-playbook: This would help with setting up an AiiDA lab server for multiple users. From what I understand, the environment for each user is locked down, with only the admin having the rights to install packages. This system also deploys a PostgreSQL service and RabbitMQ.
  2. Using the kubernetes helm chart: This would set up an AiiDA lab server in a kubernetes cluster.
  3. Deploying AiiDA using a docker container.
What I would like to know is how much freedom the users have in these options? In the docker container presumably having a jupyter environment for each user would probably not be possible.

In the AiiDA lab are they tied to the jupyter notebook or can they use a terminal to perform other operations too? Where is the repository stored? Can one configure this?

I was also wondering about the possibility of deploying a managed PSQL server inside Azure and connecting it to the AiiDA installations. This would allow us to have even more data redundancy. Even if this is supported by AiiDA, I was wondering if anyone has tested a similar setup. It might be interesting to perform some benchmarks. Also I wonder if it would be possible to adapt the previous solutions to use a PSQL server such as this one.

Thank you for your help

Cheers

Leopold Talirz

Jul 18, 2022, 6:28:36 AM
to aiidausers
Hi Jonathan,

thanks for taking the time to re-post the question from our private discussion on the AiiDA mailing list so that others can profit from it as well.

I will repeat that using docker images to distribute a "managed" AiiDA environment to a set of users is a very promising route in my opinion.
Currently, the AiiDA lab container is somewhat geared towards novice/non-expert users who would like to use the graphical user interface and won't need to make system-level changes to the environment (no sudo rights).
However, I suspect that the changes needed to adapt it to a more "expert" user base, or to the needs of a specific group/system are not many (and, it would be great if the AiiDA lab maintainers could document how to best go about this [1]).

I will try to answer some of your questions, and also point the current maintainers of the AiiDA lab to this thread.

> In the docker container presumably having a jupyter environment for each user would probably not be possible.
The python environment inside the container is bootstrapped from the conda environment in the image.
However, every user mounts their home folder as a persistent directory into the container.
If a user `pip install`s a package, it will go into their home directory and persist between container restarts.
This allows users to persistently modify their python/jupyter environments.
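As a minimal sketch, a user could check from a notebook or the JupyterLab terminal where such per-user installs end up. This assumes the standard Python user-site mechanism (the target of `pip install --user`); whether the AiiDA lab image relies on exactly this mechanism is an assumption, and `describe_user_site` is a hypothetical helper name.

```python
import os
import site


def describe_user_site():
    """Report where per-user packages are installed and whether that is below $HOME."""
    user_site = site.getusersitepackages()
    home = os.path.expanduser("~")
    return {
        "user_site": user_site,
        "home": home,
        # True when the user site lives below the (persistently mounted) home folder.
        "inside_home": os.path.abspath(user_site).startswith(os.path.abspath(home)),
        # False means the interpreter ignores the user site (e.g. when run as `python -s`).
        "enabled": bool(site.ENABLE_USER_SITE),
    }


if __name__ == "__main__":
    for key, value in describe_user_site().items():
        print(f"{key}: {value}")
```

If `inside_home` is true, packages installed this way sit in the mounted home folder and therefore survive a container restart.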
 
> In the AiiDA lab are they tied to the jupyter notebook or can they use a terminal to perform other operations too?
Jupyter lab contains a terminal application as well, and users are free to use whatever interface they are more comfortable with.
The only "missing feature" compared to your previous setup will be for users to directly SSH into their containers from, say, their work station (rather than going through the browser).
This is also relevant for when users want to use their own IDEs like VSCode to work on source code inside the container (which typically would use the Remote SSH plugin).
Since this is a rather generic docker use case, I suspect there may be solutions already available that support this scenario (exposing SSH access to docker containers on a host without giving users access to the host itself), and if you need this feature we could have a look.

 
> Where is the repository stored? Can one configure this?
The repository is also stored in the user's home folder.
It may certainly be possible to, say, add an additional directory mount for a central NFS file system and, by default, store the AiiDA file repositories there.

> I was also wondering about the possibility of deploying a managed PSQL server inside Azure and connecting it to the AiiDA installations. This would allow us to have even more data redundancy. Even if this is supported by AiiDA, I was wondering if anyone has tested a similar setup. It might be interesting to perform some benchmarks.
As we discussed in private, I think this sounds like a good idea, and I would be interested in these benchmarks as well.
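For reference, pointing a profile at an external database should only change the connection options passed to `verdi setup`. The sketch below merely assembles such a command without running it. The option names follow `verdi setup` as documented for AiiDA 1.x and early 2.x and are worth checking against `verdi setup --help` for the installed version; the host name, profile name, credentials and repository path are all placeholders, and nothing here has been tested against Azure.

```python
import shlex


def build_verdi_setup_command(profile, email, first_name, last_name, institution,
                              db_host, db_name, db_user, db_password,
                              repository, db_port=5432):
    """Assemble (but do not run) a non-interactive `verdi setup` call."""
    return [
        "verdi", "setup", "--non-interactive",
        "--profile", profile,
        "--email", email,
        "--first-name", first_name,
        "--last-name", last_name,
        "--institution", institution,
        # Connection details of the external (e.g. managed) PostgreSQL server.
        "--db-host", db_host,
        "--db-port", str(db_port),
        "--db-name", db_name,
        "--db-username", db_user,
        "--db-password", db_password,
        # Location of the AiiDA file repository; could be a shared mount instead of $HOME.
        "--repository", repository,
    ]


if __name__ == "__main__":
    command = build_verdi_setup_command(
        profile="alice",                          # placeholder values throughout
        email="alice@example.org",
        first_name="Alice",
        last_name="Example",
        institution="Example Lab",
        db_host="example-server.postgres.database.azure.com",
        db_name="aiida_alice",
        db_user="aiida_alice",
        db_password="change-me",
        repository="/home/alice/.aiida/repository/alice",
    )
    print(shlex.join(command))
```

Note that a password passed on the command line is visible in the process list, so a real deployment would probably prefer a config file or an interactive prompt.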
 
> Also I wonder if it would be possible to adapt the previous solutions to use a PSQL server such as this one.
I see no major hurdles to, say, adapt this mode in the AiiDA lab.
The only point that comes to mind is that creating a new user will now require creating a new database on the database service.
Automating this from inside the JupyterHub may require some fiddling/extra coding, but as a practical workaround you can also just pre-create those databases.
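Pre-creating the databases could look roughly like the sketch below, which only generates the SQL statements for one user. The `aiida_` prefix and the helper names are hypothetical, the statements mirror what is commonly used for a local PostgreSQL setup, and a managed service may restrict some of them (for example who may own a database), so this is untested on Azure.

```python
import secrets


def quote_ident(name):
    """Quote a PostgreSQL identifier by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value):
    """Quote a string literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def provisioning_statements(username, password, prefix="aiida_"):
    """Return the SQL statements that create one role and one database for a user."""
    role = quote_ident(prefix + username)
    database = quote_ident(prefix + username)
    return [
        f"CREATE ROLE {role} LOGIN PASSWORD {quote_literal(password)}",
        # CREATE DATABASE cannot run inside a transaction block, so the
        # connection executing these statements needs autocommit enabled.
        f"CREATE DATABASE {database} OWNER {role} ENCODING 'UTF8'",
        f"GRANT ALL PRIVILEGES ON DATABASE {database} TO {role}",
    ]


if __name__ == "__main__":
    # Placeholder user names; each user gets a freshly generated password.
    for user in ["alice", "bob"]:
        for statement in provisioning_statements(user, secrets.token_urlsafe(16)):
            print(statement + ";")
```

The output can be reviewed and then run with an admin account on the database service, which keeps the admin credentials out of the user containers.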

Finally, for those reading along I should mention that Jonathan and I decided to work on a template for setting up a multi-user AiiDA deployment on Azure.
We will make it public once we're happy with it. If anyone is interested in contributing or providing feedback, please send me an email and I will happily add you to the repository.

Best wishes,
Leopold

Jonathan Chico

Jul 19, 2022, 4:09:20 AM
to aiida...@googlegroups.com, Leopold Talirz


> Jupyter lab contains a terminal application as well, and users are free to use whatever interface they are more comfortable with.
> The only "missing feature" compared to your previous setup will be for users to directly SSH into their containers from, say, their work station (rather than going through the browser).
> This is also relevant for when users want to use their own IDEs like VSCode to work on source code inside the container (which typically would use the Remote SSH plugin).
> Since this is a rather generic docker use case, I suspect there may be solutions already available that support this scenario (exposing SSH access to docker containers on a host without giving users access to the host itself), and if you need this feature we could have a look.

Yes, I think this would be ideal, since, depending on the user, being able to just SSH into the machine might be preferable. It seems quite possible, but I'm unsure what the best approach is security-wise.

> I see no major hurdles to, say, adapt this mode in the AiiDA lab.
> The only point that comes to mind is that creating a new user will now require creating a new database on the database service.
> Automating this from inside the JupyterHub may require some fiddling/extra coding, but as a practical workaround you can also just pre-create those databases.

I think this should be quite doable, probably something that can be easily automated if one has the right credentials.

The only other thing I see is that one would have to deploy using OAuth2 for security reasons; according to the documentation, this should be simple.

Cheers


--
AiiDA is supported by the NCCR MARVEL (http://nccr-marvel.ch/), funded by the Swiss National Science Foundation, and by the European H2020 MaX Centre of Excellence (http://www.max-centre.eu/).
 
Before posting your first question, please see the posting guidelines at http://www.aiida.net/?page_id=356 .

--
Jonathan Chico

Carl Simon Adorf

Jul 19, 2022, 12:55:30 PM
to aiidausers
Hi Jonathan,

Thank you very much for reaching out and providing such a detailed account of your requirements.

I agree with Leopold that using containers is likely the optimal solution to provision a controlled user environment and would further argue that by deploying those environments via JupyterHub, many of the aforementioned challenges regarding user and environment management would be resolved.

As Leopold said, while we want to ensure that the image is compatible with a standard JupyterLab environment, in the end it is still just a container with a shell that can in theory be SSHed into, as long as it is exposed on the network (either directly or via a proxy). There already seem to be some early-stage efforts to implement direct SSH access to user sessions in the context of JupyterHub, which would be worth a try IMO: https://github.com/yuvipanda/jupyterhub-ssh

I think that the database management, i.e., automatically creating new users in the database could be achieved with a custom spawner implementation: https://jupyterhub.readthedocs.io/en/stable/reference/spawners.html or simply as part of the startup scripts.
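A lighter-weight variant of the spawner idea might be a `pre_spawn_hook`, which JupyterHub calls with the spawner object before a user's server starts. The sketch below is only an illustration: `make_pre_spawn_hook`, the `ensure_database` callable and the `AIIDA_DB_*` environment variable names are hypothetical, and whether the image's startup scripts read such variables is an assumption.

```python
from types import SimpleNamespace


def make_pre_spawn_hook(ensure_database, db_host):
    """Build a hook that provisions a database before a user's server starts.

    `ensure_database(username)` is a hypothetical callable that creates the
    user's database if it is missing and returns the database name.
    """
    def pre_spawn_hook(spawner):
        username = spawner.user.name
        db_name = ensure_database(username)
        # Hand the connection details to the container via environment variables.
        spawner.environment["AIIDA_DB_HOST"] = db_host
        spawner.environment["AIIDA_DB_NAME"] = db_name

    return pre_spawn_hook


# In jupyterhub_config.py the hook would then be registered along the lines of:
#   c.Spawner.pre_spawn_hook = make_pre_spawn_hook(my_ensure_database, "db.example.org")

if __name__ == "__main__":
    # Stand-in for a real spawner, just to exercise the hook locally.
    fake_spawner = SimpleNamespace(user=SimpleNamespace(name="alice"), environment={})
    hook = make_pre_spawn_hook(lambda name: f"aiida_{name}", "db.example.org")
    hook(fake_spawner)
    print(fake_spawner.environment)
```

Keeping the provisioning logic behind a callable means the admin credentials for the database service stay on the hub side and never reach the user containers.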

Since you mentioned that you use conda environments to manage the individual user environments, maybe we can use conda-store to manage and distribute those: https://conda-store.readthedocs.io/en/latest/ .

Concerning next steps, setting up another multi-user deployment on a single server is currently in my own backlog. This is therefore a good time to address many of these issues, since I need to work on them anyway. If you like, you could start by trying to create an AiiDAlab server using the ansible playbook: https://github.com/aiidalab/ansible-playbook-aiidalab-server and report any issues that you encounter. In the meantime, I will work on finishing the effort of decoupling the services from the image: https://github.com/aiidalab/aiidalab-docker-stack/issues/259

We can keep coordinating efforts either directly here on the mailing list or in the form of issues on the repository or both.

Best,
Simon

Jonathan Chico

Jul 25, 2022, 6:25:50 AM
to aiida...@googlegroups.com
Hi Carl Simon!


Thanks for the recommendation! I have indeed tried to deploy the ansible server in Azure using two types of VMs, Ubuntu 18.04 and Ubuntu 20.04. Unfortunately, I ran into problems with both of them and was not able to deploy the service. I opened an issue on GitHub; maybe we can discuss it there in more detail.

Thanks!

Cheers



--
Jonathan Chico