colin...@googlemail.com
Feb 9, 2018, 10:05:48 AM
to Pyteomics
Hi,
Thank you for your work on this useful library. However, there is something about it that confuses me.
We wish to iterate through all the DBSequence elements and use the following code:
sequence_collection = next(mzid_reader.iterfind('SequenceCollection'))
for sequence in sequence_collection['DBSequence']:
    pass  # do stuff with sequence
where mzid_reader is an instance of the MzIdentML class.
It works, and it seems to be the fastest way to do it for large files. We think this is because there can only be one SequenceCollection element, so this approach loads its contents into memory for fast iteration without necessarily going through the entire file. (Any comments on this are most welcome.)
However, if I use the same pattern to iterate through the AnalysisSoftware elements, it doesn't work as expected. If I write:
analysis_software_list = next(mzid_reader.iterfind('AnalysisSoftwareList'))
then the dict returned contains only the first AnalysisSoftware element, not a collection of them, even if there is more than one of them under AnalysisSoftwareList. The behaviour seems to differ from what I get with iterfind('SequenceCollection').
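To rule out the file itself, the number of AnalysisSoftware elements can be checked independently of Pyteomics. The following is only a minimal sketch using the standard library's xml.etree.ElementTree. The SAMPLE fragment, the ids in it, and the helper names local_name and collect_ids are made up for illustration and are not part of Pyteomics or of any real file.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed mzIdentML-like fragment (illustration only;
# a real file has many more elements and required attributes).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<MzIdentML xmlns="http://psidev.info/psi/pi/mzIdentML/1.1">
  <AnalysisSoftwareList>
    <AnalysisSoftware id="AS_1" name="search engine A"/>
    <AnalysisSoftware id="AS_2" name="search engine B"/>
  </AnalysisSoftwareList>
  <SequenceCollection>
    <DBSequence id="DBSeq_1" accession="P1"/>
  </SequenceCollection>
</MzIdentML>
"""


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit('}', 1)[-1]


def collect_ids(source, wanted):
    """Stream through the XML and return the 'id' of every element
    whose local tag name equals `wanted`."""
    ids = []
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) == wanted:
            ids.append(elem.get("id"))
            elem.clear()  # release the element once its id has been read
    return ids


software_ids = collect_ids(io.BytesIO(SAMPLE), "AnalysisSoftware")
print(software_ids)
```

For a real file, a path to the .mzid file could be passed to collect_ids in place of the io.BytesIO object. If this reports more than one id while the dict from Pyteomics holds a single AnalysisSoftware entry, the difference would come from how the reader builds the dict rather than from the file.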
Could somebody explain this, please? It may be some simple mistake or misunderstanding, as I am new to Python and this library.
Best wishes,
Colin