Thanks for the note!
I have actually never worked with a PDB file that contained multiple structures and didn't know that this was possible in the official PDB spec.
In any case, it should definitely be handled one way or the other. Currently, I don't have a good idea of how best to handle that and would welcome any thoughts and feedback (let me cross-post this on the GitHub issue tracker; it may be better to continue the discussion about potential ways to implement it there).
I think one of the problems with the DataFrame format is that having all structures in one DataFrame would probably produce a lot of weird or unexpected results, so it would probably be best to separate the structures one way or the other ...
1) One option would be to provide a utility function (analogous to the split_multimol2 function,
http://rasbt.github.io/biopandas/tutorials/Working_with_MOL2_Structures_in_DataFrames/#parsing-multi-mol2-files) that generates multiple PandasPdb objects from such a file. I.e., it would simply be a list
pdbs = [pdb_1, pdb_2, ..., pdb_n]
which would preserve the current functionality of the library without any backwards-incompatible changes. This would also make it easier to use the multiprocessing library to analyze multiple PandasPdb objects efficiently in parallel.
2) Right now, the PandasPdb objects have a dictionary containing multiple DataFrames
dict_keys(['ATOM', 'HETATM', 'ANISOU', 'OTHERS'])
For multi-PDB files, the dictionary could be expanded to
dict_keys(['ATOM_1', 'HETATM_1', 'ANISOU_1', 'OTHERS_1', 'ATOM_2', 'HETATM_2', 'ANISOU_2', 'OTHERS_2', ...])
I strongly favor scenario 1), though. However, I would love to hear feedback on this and am open to other suggestions!
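To make scenario 1) a bit more concrete, here is a rough sketch of the splitting step in plain Python. The function name split_multimodel_pdb is hypothetical (it is not part of BioPandas), and the sketch only separates the text between MODEL and ENDMDL records; records outside those blocks (e.g., header lines) are dropped here, which a real implementation would probably want to handle differently.

```python
def split_multimodel_pdb(pdb_text):
    """Split PDB text into one text block per MODEL ... ENDMDL section.

    Hypothetical helper, not part of BioPandas. If the text contains no
    MODEL records, it is returned unchanged as a single-element list.
    """
    models = []
    current = None  # lines of the model being collected, or None outside a model
    for line in pdb_text.splitlines():
        record = line[:6].strip()  # the record name occupies columns 1-6
        if record == "MODEL":
            current = []
        elif record == "ENDMDL":
            if current is not None:
                models.append("\n".join(current))
            current = None
        elif current is not None:
            current.append(line)
    if not models:
        return [pdb_text]
    return models
```

Each returned block could then be turned into its own PandasPdb object (for instance, by writing it to a temporary file and passing that to read_pdb), which would give the list of objects described above.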
In any case, the current read_pdb method should also raise an error (or at least a warning) if MODEL and ENDMDL records are found in a PDB file, so that multi-structure files don't lead to any unexpected behavior.
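A check along these lines might be enough as a first step. This is only a sketch: warn_if_multimodel is a hypothetical name, and how it would be wired into read_pdb is left open.

```python
import warnings


def warn_if_multimodel(pdb_lines):
    """Warn if any line is a MODEL or ENDMDL record.

    Hypothetical helper, not part of BioPandas. Returns True if such a
    record was found, False otherwise.
    """
    found = any(line[:6].strip() in ("MODEL", "ENDMDL") for line in pdb_lines)
    if found:
        warnings.warn(
            "MODEL/ENDMDL records found: the file appears to contain "
            "multiple structures, which are not handled separately.",
            UserWarning,
        )
    return found
```

Raising an exception instead of warning would be the stricter alternative; a warning keeps existing scripts running while still flagging the problem.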
Best,
Sebastian