Hi,
a) Should the unit cell be repeated before considering the cutoff sphere for symmetry functions? (I have this doubt because the xsf files contain the coordinates of periodic structures, and I think that atoms in the neighbouring unit cells can also come inside the cutoff sphere for a particular atom and contribute towards its structural fingerprint.)
I can't remember whether AENet itself uses a ghost-atom system, but if you are in doubt it never hurts to expand the box size within reason. Your training time shouldn't increase by much if you scale the box up to, say, 30-100 atoms instead of 4-6. So long as every box length is at least 2x the cutoff, that takes care of the first problem, which is incorrect energies for a periodic system.
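If you want to check the 2x-cutoff rule before training, something along these lines should do it. This is only a rough NumPy sketch, not anything from AENet: the function names are made up for illustration, the cell is assumed to be a 3x3 array with one lattice vector per row, and positions are assumed to be Cartesian. For non-orthogonal cells it uses the distance between opposite cell faces rather than the lattice vector length.

```python
import numpy as np

def perpendicular_widths(cell):
    """Distance between opposite faces of the cell for each lattice direction."""
    cell = np.asarray(cell, dtype=float)
    volume = abs(np.linalg.det(cell))
    a, b, c = cell
    face_areas = np.array([
        np.linalg.norm(np.cross(b, c)),  # face spanned by b and c (width along a)
        np.linalg.norm(np.cross(c, a)),  # face spanned by c and a (width along b)
        np.linalg.norm(np.cross(a, b)),  # face spanned by a and b (width along c)
    ])
    return volume / face_areas

def repeats_for_cutoff(cell, cutoff, factor=2.0):
    """Smallest integer repeats so every width is at least factor * cutoff."""
    widths = perpendicular_widths(cell)
    return np.ceil(factor * cutoff / widths).astype(int)

def make_supercell(cell, positions, repeats):
    """Replicate Cartesian positions along each lattice vector."""
    cell = np.asarray(cell, dtype=float)
    positions = np.asarray(positions, dtype=float)
    na, nb, nc = (int(n) for n in repeats)
    # Integer (i, j, k) triples times the cell matrix give i*a + j*b + k*c.
    shifts = np.array(
        [[i, j, k] for i in range(na) for j in range(nb) for k in range(nc)],
        dtype=float,
    ) @ cell
    new_positions = (positions[None, :, :] + shifts[:, None, :]).reshape(-1, 3)
    new_cell = cell * np.asarray(repeats, dtype=float)[:, None]
    return new_cell, new_positions
```

For example, a 4 Angstrom cubic cell with a 6.5 Angstrom cutoff needs 4 repeats in each direction. Keep in mind that the energy in the XSF file would have to be scaled by the number of copies as well, since the supercell contains that many unit cells.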
However, in my experience, if your boxes are small enough that you're worried about that, you may run into another problem: finite-size effects. When periodic boxes are too small, it's mathematically impossible to create certain atom arrangements. Larger systems have more statistical diversity than smaller ones, which is highly important for training a neural network. To give a simple example, in a box with 1 atom and a fixed box length, the distance between the atom and its periodic image can never change.
If your training set only contains small-box configurations, very often when you apply your neural network to a real system you'll find it falls apart, because you never trained it on the configurations you will encounter in the larger real system.
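The one-atom example is easy to verify numerically. Here is a small sketch, assuming a cubic box; the function name is just for illustration. Wherever you put the atom, the nearest periodic image is always exactly one box length away, so a training set built from that box only ever samples a single distance.

```python
import itertools
import numpy as np

def nearest_image_distance(position, box_length):
    """Distance from an atom to its own nearest periodic image (cubic box)."""
    position = np.asarray(position, dtype=float)
    best = np.inf
    for shift in itertools.product((-1, 0, 1), repeat=3):
        if shift == (0, 0, 0):
            continue  # skip the atom itself
        image = position + box_length * np.array(shift, dtype=float)
        best = min(best, float(np.linalg.norm(image - position)))
    return best

box_length = 3.0  # arbitrary illustrative value
rng = np.random.default_rng(0)
# Move the single atom to several random places in the box.
distances = [
    nearest_image_distance(rng.uniform(0.0, box_length, size=3), box_length)
    for _ in range(5)
]
print(distances)  # all entries equal box_length up to rounding
```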
b) By going through one of the questions in this group, I came to know that the reference energy commented in the xsf is not the structural energy, but the cohesive energy. So can I compare this cohesive energy with the sum of outputs of the atomic NNs (ΣEi) to get the cost function (RMSE or MAE)? (In other words, are the cohesive energy and E_ref the same?)
As far as the XSF goes, the answer is that it can be either. It depends on how you define it in your input.
The reference in the XSF file is by default the total energy of the system, but you can also define it as the cohesive energy with a bit of magic in the input scripts. In the AENet input scripts you'll find a place where you can specify the isolated atom DFT energy for each atom type in your system.
If you set the isolated atom energies of all atom types to 0, then the total energy in the file is the same as the cohesive energy.
E_coh = E_total - N_atom1*E_atom1 - N_atom2*E_atom2 - ...
If E_atom1 = E_atom2 = ... = 0, then
E_coh = E_total
This is how you would define it if you are training a neural network against a classical force field, where the isolated atom energies are defined to be 0 by convention. That's a consequence of the cohesive energy equation.
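To make the bookkeeping concrete, here is a minimal sketch of the cohesive energy equation in plain Python. It is not AENet code: the function name and the energies are hypothetical placeholders, not real DFT values.

```python
from collections import Counter

def cohesive_energy(total_energy, species, atomic_energies):
    """E_coh = E_total - sum over atom types of N_type * E_type."""
    counts = Counter(species)  # number of atoms of each type
    return total_energy - sum(n * atomic_energies[s] for s, n in counts.items())

# Hypothetical numbers, only to show the arithmetic.
species = ["Ti", "O", "O"]
e_total = -10.0
isolated = {"Ti": -2.0, "O": -1.5}

print(cohesive_energy(e_total, species, isolated))

# With all isolated atom energies set to 0, E_coh equals E_total.
zeros = {"Ti": 0.0, "O": 0.0}
print(cohesive_energy(e_total, species, zeros))
```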
So it's possible to put either the cohesive energy or the total energy in the XSF files, so long as the isolated atom energies in the input file match. When training against DFT or other quantum methods, though, it's a bit better in my opinion to supply the total structure energies and the isolated atom energies separately, in case you have to recompute either of them for whatever reason.
AENet will compute and return the cohesive energy along with the total energy prediction. As far as the cost function is concerned, it should be identical for cohesive and total energy if both AENet and the DFT reference use the same isolated atom energies, since those energies drop out when you subtract the prediction from the reference.
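You can convince yourself of this with a quick check. The sketch below uses made-up numbers for three structures (not real DFT or AENet output) and a plain RMSE; whether AENet normalizes its error per atom is not assumed here. Because the same per-structure offset is subtracted from both the prediction and the reference, the error comes out the same either way.

```python
import numpy as np

def rmse(predicted, reference):
    """Root-mean-square error between two sets of energies."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((predicted - reference) ** 2)))

# Hypothetical total energies for three structures.
e_total_ref = np.array([-10.0, -20.5, -31.2])
e_total_pred = np.array([-10.1, -20.3, -31.0])

# Atom counts per structure (columns = atom types) and isolated atom energies.
atom_counts = np.array([[1, 2], [2, 4], [3, 6]])
e_atom = np.array([-2.0, -1.5])

# The same offset sum(N_type * E_type) is removed from both sides.
offset = atom_counts @ e_atom
e_coh_ref = e_total_ref - offset
e_coh_pred = e_total_pred - offset

print(rmse(e_total_pred, e_total_ref))
print(rmse(e_coh_pred, e_coh_ref))
```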
-Troy Loeffler
Center for Nanoscale Materials
Argonne National Lab.