GPU implementation of aenet


Daniel Massote

Aug 19, 2021, 1:14 PM
to aenet

Dear Nong and collaborators,

I've been using aenet (no publications yet) and I noticed that the two main bottlenecks of the whole process of making ML potentials are dataset generation and model training.

Regarding model training, is there any initiative to port the aenet code to platforms such as PyTorch? Is there any documentation (in aenet's docs) from which I could start working on this?

Thanks for the great work so far. Also, congrats on the LAMMPS implementation. It is really fast.

Best,
Daniel

Artrith, N. (Nong)

Aug 19, 2021, 1:39 PM
to Daniel Massote, aenet, Artrith, N. (Nong)
Dear Daniel,

Happy to hear from you, and glad that aenet-lammps is working well for you; many thanks to M.S. Chen, T. Morawietz, H. Mori, and T.E. Markland for their contribution!

Yes, indeed you are right: the bottlenecks are dataset generation and model training.  As we know: no database, no machine learning.

Regarding a GPU/PyTorch implementation, we have not done it yet (no manpower); I am looking for PhD students who can help with this [1].  We plan to do it in the future since, as you also mention, GPU implementations work very well with PyTorch.  If we start working on it, we would be happy to involve your team if you are interested.
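[Editor's note: to illustrate what such a port could look like, here is a minimal sketch of a Behler-Parrinello-style training loop in PyTorch. All names, network sizes, and data below are hypothetical and for illustration only; they are not aenet's actual API, descriptor dimensions, or file formats.]

```python
import torch
import torch.nn as nn

# Hypothetical dimensions; aenet's real descriptor sizes differ.
N_DESC = 32     # descriptor (fingerprint) length per atom
HIDDEN = 16

class AtomicNet(nn.Module):
    """One feed-forward network per chemical species, Behler-Parrinello style:
    the total energy is the sum of per-atom network outputs."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_DESC, HIDDEN), nn.Tanh(),
            nn.Linear(HIDDEN, HIDDEN), nn.Tanh(),
            nn.Linear(HIDDEN, 1),
        )

    def forward(self, x):          # x: (n_atoms, N_DESC)
        return self.net(x).sum()   # total energy = sum of atomic energies

model = AtomicNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Fake training data: random descriptors and reference energies.
torch.manual_seed(0)
structures = [(torch.randn(8, N_DESC), torch.randn(())) for _ in range(4)]

for epoch in range(10):
    for desc, e_ref in structures:
        opt.zero_grad()
        loss = (model(desc) - e_ref) ** 2   # squared energy error
        loss.backward()                     # autograd handles all gradients
        opt.step()
```

The attraction of PyTorch here is that autograd provides the weight gradients (and, via differentiation with respect to atomic positions, the forces) without hand-coded backpropagation, and the same code runs on GPU by moving model and tensors to a CUDA device.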

Take care and stay healthy.

Best wishes,
Nong

[1] https://www.uu.nl/en/organisation/working-at-utrecht-university/jobs/2-phd-positions-in-atomistic-modelling-and-machine-learning-for-interfaces-in-energy-materials-10


=========================
Nongnuch Artrith, 
Assistant Professor
Debye Institute for Nanomaterials Science, Utrecht University
David de Wiedgebouw, Office 4.66, Universiteitsweg 99, 3584 CG Utrecht
Email: n.ar...@uu.nl
Phone: +31-6-28-15-97-99
Web: https://www.uu.nl/staff/NArtrith




Daniel Massote

Aug 19, 2021, 1:50 PM
to aenet
Thanks for the quick reply. I am a one-person group, though I have some collaborators with strong Python knowledge. If that is alright with you, I will be in touch in a week or so.

Best,
Daniel

Emine Kucukbenli

Aug 19, 2021, 1:55 PM
to Daniel Massote, aenet, Emine Kucukbenli
Dear Daniel, Nong, and aenet developers,

Let me take this opportunity to make a public offer, something I have had in mind for a while now:

PANNA, the TensorFlow-based code our group has been developing, is also interfaced with LAMMPS (and with other MD codes thanks to the KIM interface) and has a lot of functional overlap with aenet.

We have spent a good deal of time on accelerated execution on GPUs (TF makes this easier), on seamless interfacing with Quantum Espresso, and on visualization during training; and lately on embedding electrostatics as well as constructing hybrid potentials (force field + NN) that work with MD packages.

Yet we lack quite a few good features aenet has, such as Chebyshev descriptors or selective force training; we still train with all force components. (One has only 24 hours in a day; as Nong mentions, human time is often the bottleneck.)
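[Editor's note: for readers who have not used them, the Chebyshev descriptors mentioned here expand an atom's radial neighbor distribution in Chebyshev polynomials (Artrith, Urban, and Ceder). A simplified numpy sketch follows; the cosine cutoff is standard for such descriptors, but the cutoff radius, expansion order, and example distances are chosen arbitrarily.]

```python
import numpy as np

def cutoff(r, rc):
    """Cosine cutoff function: decays smoothly to zero at r = rc."""
    fc = 0.5 * (np.cos(np.pi * r / rc) + 1.0)
    return np.where(r < rc, fc, 0.0)

def chebyshev_radial_descriptor(distances, rc=6.0, order=8):
    """Expand the radial neighbor distribution of one atom in
    Chebyshev polynomials up to the given order (simplified sketch)."""
    r = np.asarray(distances, dtype=float)
    # Map distances in [0, rc] onto the Chebyshev domain [-1, 1].
    x = 2.0 * r / rc - 1.0
    fc = cutoff(r, rc)
    # c[n] = sum over neighbors j of T_n(x_j) * fc(r_j)
    coeffs = np.array([
        np.sum(np.polynomial.chebyshev.chebval(x, [0] * n + [1]) * fc)
        for n in range(order + 1)
    ])
    return coeffs

# Example: one atom with three neighbors at 1.5, 2.0, and 3.2 angstrom.
desc = chebyshev_radial_descriptor([1.5, 2.0, 3.2])
```

The resulting fixed-length vector (here 9 coefficients) serves as the per-atom input to the neural network, independent of the number of neighbors.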

PANNA is an open-source project just like aenet, and I think at this point it would be best for the community to join forces and truly accelerate NNP research in materials science, particularly with the urgent needs of the climate crisis looming over us. I have been a developer in the electronic-structure community for years, and I have seen first hand how research groups working in silos slow development down for everyone; given the urgency, I think we need a better way.

What do you think? Can we break this cycle as the NNP community?

Thanks, Daniel, for the opportunity to bring this up, and Nong for the down-to-earth response. I would love to hear from all aenet developers and the user community what they think about this: whether the diversity of tools as-is is more helpful than such an effort, and whether they would be willing to contribute to such an endeavour.
I am open to thinking this through publicly or privately, however you feel most comfortable. Please don't hesitate to reach out.

With best wishes;
--
Emine Kucukbenli (they/them)
Clin. Asst. Prof. - Information Systems Dept., Questrom School of Business, Boston University
Associate - Harvard School of Engineering and Applied Sciences


Artrith, N. (Nong)

Aug 19, 2021, 2:44 PM
to Emine Kucukbenli, Daniel Massote, aenet, Artrith, N. (Nong), Alexander Urban
Dear Emine, Daniel, and ænet community,

In my opinion, there is value in having multiple open-source codes developed in parallel (e.g., Quantum Espresso, Abinit, etc. for DFT). In the case of PANNA and ænet, I think it is great that we have implementations in both Python and Fortran, because each has advantages and disadvantages.

But I agree that it would make a lot of sense to exploit synergies, and it would be good to work on compatibility between all MLP codes. As I mentioned earlier, the vision is that we share data and code with the community as much as we can, so that the next generation can carry the development further and focus on their own research (rather than wasting time reproducing old work).  This is in the spirit of Figure 9 in my J. Phys. Energy 1 (2019) 032002 (Open Access; see also attached).

If we documented data and model formats better, we could work together on converters that make data and models available across the different tools. There is also potential to share some code in the form of libraries. I think this would be a big step for the community.  For example, we could think of having (i) standards for model parameters, machine-learning method definitions, and descriptors; (ii) protocols for automatic reference-data generation and potential training/testing; and (iii) repositories for collecting energy models, benchmarks, and data sets.
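[Editor's note: point (i), a shared model format, could be as simple as a documented JSON schema that any code can read and write. The sketch below is purely hypothetical; it is neither aenet's nor PANNA's actual on-disk layout, just an illustration of what a portable schema could hold.]

```python
import json

def export_model(weights, descriptor_spec, path):
    """Write network weights plus the descriptor definition to one JSON
    file, so another code can rebuild the same potential."""
    doc = {
        "format_version": "0.1",
        "descriptor": descriptor_spec,            # e.g. type, cutoff, order
        "layers": [{"W": W, "b": b} for (W, b) in weights],
    }
    with open(path, "w") as fh:
        json.dump(doc, fh, indent=2)

def import_model(path):
    """Read the portable format back into (weights, descriptor_spec)."""
    with open(path) as fh:
        doc = json.load(fh)
    weights = [(layer["W"], layer["b"]) for layer in doc["layers"]]
    return weights, doc["descriptor"]

# Round-trip example with a tiny two-layer network.
w = [([[0.1, 0.2]], [0.0]), ([[0.5]], [0.1])]
spec = {"type": "chebyshev", "cutoff": 6.0, "radial_order": 8}
export_model(w, spec, "model.json")
w2, spec2 = import_model("model.json")
```

The point is not this particular schema but that, once such a format is agreed on and documented, each code only needs one importer and one exporter instead of pairwise converters.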

I am also open to discussing this offline if anyone is interested.

Best regards,
Nong




[Attachment: 2019-Artrith-jpenergyab2060f9_hr.jpg]