Writing xyz trajectory with unit cell info

468 views
Skip to first unread message

Carlos Bornes

unread,
Sep 12, 2022, 9:22:03 AM9/12/22
to MDnalysis discussion
Dear all,

I'm trying to convert a pdb trajectory file into a XYZ version to making compatible with another tool (not that it is important, but the tool is the ASE).

I've tried to write the file using the following code and in that case, I don't get any information about the unit cell

u = mda.Universe('MFI-pos-1.pdb', 'MFI-pos-1.pdb', format = 'CHEMFILES', guess_bonds= True, vdwradii=({'Si':2.1 , 'Al':1.84}))
atoms = u.select_atoms('all')
with mda.Writer('MFI_test.xyz', atoms.n_atoms) as W:
    for ts in u.trajectory:
        W.write(atoms)


I also tried using the chemfiles format:

u = mda.Universe('MFI-pos-1.pdb', 'MFI-pos-1.pdb', format = 'CHEMFILES', guess_bonds= True, vdwradii=({'Si':2.1 , 'Al':1.84}))
atoms = u.select_atoms('all')
with mda.Writer('MFI_Al_int_int_2TMPO_1site_AIMD_423-pos-1_test_chemfiles.xyz', atoms.n_atoms, format = 'CHEMFILES') as W:
    for ts in u.trajectory:
        W.write(atoms)


But in that case, I get a lot of unnecessary information that creates a conflict with ASE. 

606
Lattice="20.078 0 0 0 19.894 0 0 0 26.744"
O 7.244 1.154 9.662  SYSTEM 0
O 2.792 18.68 2.964  SYSTEM 0
O 12.824 11.157 3.736  SYSTEM 0
O 17.233 8.855 10.424  SYSTEM 0
O 12.831 18.688 3.736  SYSTEM 0
O 17.241 1.138 10.47  SYSTEM 0
O 7.223 8.713 9.599  SYSTEM 0
....

If I convert it into this it works perfectly.

606
Lattice="20.078 0 0 0 19.894 0 0 0 26.744"
O 7.244 1.154 9.662
O 2.792 18.68 2.964
O 12.824 11.157 3.736
O 17.233 8.855 10.424
O 12.831 18.688 3.736
O 17.241 1.138 10.47
O 7.223 8.713 9.599
....

With all that said is there a way to write a XYZ trajectory file with the unit cell parameters? Or do you suggest using another file version? The pdb seems to be a no go for ASE.

Regards,
Carlos

Hugo Macdermott-Opeskin

unread,
Sep 13, 2022, 10:17:34 PM9/13/22
to MDnalysis discussion
Hi Carlos 

Sorry to hear you have had trouble converting your files.

We really should have a parser for the EXTXYZ format which i think is what you are refering to (https://wiki.fysik.dtu.dk/ase/ase/io/formatoptions.html#extxyz)?

I have also been looking into writing an interface for ASE but this would be a long way off (thanks for reminding me to raise and issue). 

I would say that your best bet is the GROMACS .gro format which has box info and can be written and read by both ASE and MDA.

Let me know if you have any more issues.

Cheers


Hugo 

Hugo Macdermott-Opeskin

unread,
Sep 14, 2022, 4:02:19 AM9/14/22
to MDnalysis discussion
Addendum ASE does seem to have read and write support for PDB under the heading `proteindatabank` https://wiki.fysik.dtu.dk/ase/ase/io/formatoptions.html#proteindatabank

Guillaume Fraux

unread,
Sep 14, 2022, 5:25:23 AM9/14/22
to MDnalysis discussion
Hi Carlos,

Could you share the PDB file you are trying to convert?

What's happening here is that there are more atom properties in the data than ASE expect (the extra "SYSTEM 0"), and reading fails.
These properties should be defined in a `Properties` field in the header that you did not include in your message, and that ASE seems to have an issue parsing.

There should be a way to tell chemfiles not to include these properties in the output, but I need to check your file first.

Cheers,
Guillaume

Carlos Bornes

unread,
Sep 14, 2022, 5:48:13 AM9/14/22
to mdnalysis-...@googlegroups.com
Dear Guillaume

I've tried to read the pdb using ASE but I guess the code I'm using is writing less information than needed for ASE reading formatting

Anyway I'm attaching part of the trajectory file

Best,
Carlos

--
You received this message because you are subscribed to a topic in the Google Groups "MDnalysis discussion" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/mdnalysis-discussion/rlR0gcPxOV0/unsubscribe.
To unsubscribe from this group and all its topics, send an email to mdnalysis-discus...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/mdnalysis-discussion/e51454a3-f31f-4853-8a86-b21c719a80f2n%40googlegroups.com.
MFI-pos-1.pdb

Guillaume Fraux

unread,
Sep 14, 2022, 6:04:35 AM9/14/22
to mdnalysis-...@googlegroups.com
Ok, looks like you found a bug!

What's happening is that chemfiles is adding an empty `altloc` property (since there is none defined in your PDB file), and then trying to output it in the file.
This results in properties being defined as `Properties=species:S:1:pos:R:3:altloc:S:1:segid:S:1:segindex:R:1`, but the altloc field is an empty string, which makes the file unreadable.

So there are multiple temporary fixes for you:
    - manually change `Properties=species:S:1:pos:R:3:altloc:S:1:segid:S:1:segindex:R:1` to `Properties=species:S:1:pos:R:3:segid:S:1:segindex:R:1` everywhere
    - manually change each line from `O 7.241 1.162 9.63  SYSTEM 0` to `O 7.241 1.162 9.63 "" SYSTEM 0` everywhere
    - use chemfiles directly for the conversion, since we don't add empty altloc in this case:

from chemfiles import Trajectory
file = Trajectory("MFI-pos-1.pdb")
output = Trajectory("converted.xyz", "w")
for frame in file:
    # if you need it, although XYZ does not store bonds
    # frame.guess_bonds()

    output.write(frame)

I'll have a look at fixing this both on the chemfiles side (adding quotes around empty strings) and mda side (don't add altloc if it is not defined in the file?).

Cheers,
Guillaume


Carlos Bornes a écrit le 14.09.22 à 11:47 :
Reply all
Reply to author
Forward
0 new messages