Extracting peak-lists from mzXML in "Python"


datta

Jan 22, 2010, 12:00:57 PM
to spctools-discuss
If there is someone who could help: I have been trying to extract
peak lists from mzXML in Python. I have outlined below what I am trying
to do. Clearly I don't understand something, but I can't quite figure
out what I am missing.

Rgds,
Datta.

1) I am trying to extract peak-lists from mzXML in Python. The
original section of mzXML is as follows:

<scan num="1"
msLevel="1"
peaksCount="270"
polarity="+"
scanType="Full"
filterLine="FTMS + p NSI Full ms [300.00-1600.00]"
retentionTime="PT300.123S"
lowMz="301.126"
highMz="1585.59"
basePeakMz="445.111"
basePeakIntensity="135443"
totIonCurrent="731403" >
<peaks precision="32"
byteOrder="network"
pairOrder="m/z-int" >Q5aQG0R3K/hDlpE6RVIn/
0OW12JED0lzQ5cRrkSObAdDlxJYRAD3mUOYHvJFjM65Q5h9ckP9BiRDmJ5vRAWJNUOad1lEAZ/
zQ5qZO0UwRjpDnJCDRFeg5kOckZBGTL
+iQ5ym70Q24qVDnRHsRIfZK0OdW81EEGlxQ52SsEQKmUhDnZaERCM8pkOdn4pFFBTVQ56MykRRt11DnqOvRA1Hn0Ohq
+dEEbP8Q6M3H0P1a0FDo4AuRQzUwEOjiVlD6Af/Q6R/TUQaQYpDpH
+8RC9rs0OkkUhD6htlQ6Yi4kVmDXBDplc0RAk09UOnDO9EKudJQ6ePSkTx3JJDqD3jRBxMREOp7ndEICmbQ67tVEQRNQtDrxCKRDOYqkOvl2lD8S
+1Q7GHkkUHJqdDsgg/
RDNPEEOyhvlEHCE0Q7KH7kQ2cyhDsxjpRBjaCEOzmTBEFLFPQ7b9m0QG11dDt6geRCMwj0O5icBEqPh1Q7mKQkSdO29DuYwNR7j5e0O5p1ZExr2GQ7oEfkQr6cJDugqyREbEjkO6C
+xGCDvKQ7oNLERuBitDuiCoRArm5UO6i5VFsLhTQ7sL4ESCoJxDu4wDQ/
k38UPAjnlEEdg8Q8CSYkQngqxDwYDKRC1E6EPBghJEFRPIQ8IO/
0R7iUpDwo07RJ7NPkPDodhEa7OdQ8OjZEarYUFDxCOwRTnj6UPHJnBD99gVQ8igCUQIpltDyNh6Q/
WvhEPJFiZEVbLkQ818bkQp/rpDzqCVRJmoeUPRpxZFTr6zQ9In20SI6OZD06s+Q/
59v0PWiaFEK/M+Q9ipnUQbVnFD2iqYRIJn10PboVFD/
G0zQ96Lg0TjBqdD3o45SAREokPekThEi4O+Q98OJ0bCyeBD343CRi9MXkPfj8VEayliQ9+r
+UQQmx1D4A0ORI1kBkPgDe1EgteIQ+AoJ0QWENhD4I1SRISPa0PhDs1EDqGHQ
+OpVEVQJ9pD5ZSaRN9wgEPnDm1ETrmMQ+cRiUeb0fJD55FoRkwj0UPnm55EDKY8Q
+gOeEQhKvpD6BDZRa9GRkPokJpEFJXhQ+mLTkTCnc5D6gwGRGYH2kPqIidEDh1EQ
+ojFkQFGoND7hfXRLAeOUP1FRFFTdvNQ/UdFkQlMZhD9ZSuRD/xGUP1t3pEJrAIQ/YU
+USoDnND9vo6RCZ9+0P5jXVEDooyQ/rzNkQ5CJxD+vTKRCgsbEP695JENy3nQ/
r4p0RMsoVD/JMHRAaBiEQAe2NEFPpxRAG4SUQ2hcVEAcgxRwXKykQCB/
dFn8ZRRAIIz0RN8GlEAhAPRBVmZ0QCR+JFQt
+QRAKHwURJXYFEBcukRFFC3EQGCCpEUakbRAYJ4UcDlBZEBkj1RJt8XUQGSatFgpc9RAaJSkTFpcpEBopJRCl8QUQGyWFETTiyRAmM8UT0Py9ECfUfRApsBEQNC7ZFH1D1RA1Lg0SHua1EEIfARC1eL0QR7pJEGtcbRBNUW0P0ubBEE7UNRBPLvUQUSThF06i2RBSJQUTSiNlEFMidRGQP2UQVCQZEMKc8RBVJHUQMdfFEF4UnREgAy0QXjDpEbo1GRBeQckR5w7REF5I0RCK7ZUQXlLFEEgxLRBfMf0SYuXpEGAxoREr6C0QYit5FsgouRBjK40TgEslEGQrhRL5UnEQbI4BECxaVRBt7+EQJa4NEHA4rRIzTh0QcNodEMpRBRBxOQUQySJ5EHTQFRBLSi0QehWNEGFcORB
+M3ER1YEVEH8yiRFrOtEQgDQVEGUFURCE62UQHfatEIgjbRE0UdEQiCbREX5XzRCVoaEQOsGREJspVROOUakQnCb9EaEmsRCdKIkQsA/
JEKfcBRCi1XUQqQI9EIHOlRCsLx0SQd
+dEK0v9RIteTUQry0FEJINKRCxOv0QUP5hELfl5REWyGkQyThRD/
IoPRDbr80RoxRJEOAQURCqeukQ5S0JEPpwtRDnLSkRN3V1EPZg
+RFXVhkRBRa9EFmcZREGR1kQUV5FEQmAyRB4vnkRDOKlENxxrREn320QiCwRETI3QRB4QxERMzg9EDBw2REzQQ0Q1MilETNq
+RETAMkRM44lENEIHREz6vkQOT3ZETxTTRDg+YERQH8dEnPPQRFAg3ERB4fpEVdLBRBe/
+ERW7+pEPywrRFdWfEQTt1REWekoRCIcpERd48lEMuvvRF4COkQNyjdEXtpPRBxQm0Rhe61EL7NLRGG3/
kQVPuZEYrePRDj/HkRkkNdEdTd6RGXzJkQEWB5EZwacRB7wG0RoQyhEC/
+XRGus9kQjkqhEa7nCRDkXqkRu6xVEsQtPRG8yjUQkSEtEc9wuRByVzUR0bldEIy6TRHXI9EQIKrtEdxxgRBc5lER4z2VEHJE0RHx3aEQrvN5EfJjlRBPZ50SBYCREM302RIF0hEQaQd5EgZETRCaBAUSDgK9EIPZ2RIPGTUQiO4VEg9xBRCCWy0SEmYpEIv/
dRIa3EEQd48VEh3mXRCL4GkSKi0xEeVHCRIqMOURNj1JEjJd9RBmPFkSNs9FEDtULRI3rD0Q1EMhElB
+bRDaiFUScBBpEK2ozRJyp
+URVoDZEnTXXRBaChESdqXVEPYuXRJ3veERbiFVEnk1SRCsuLESeZ2dEJ5+
+RJ53ZkQGzaJEopi6RNqrk0SimelEnqz+RKPqCUQZlN9EpeczRDcfWkSnVnREH1h/
RKeOg0Qcs2lEqH2lRBx5JESrBbhEFpe5RK3XBkQMUlNErrpnRB5XgESx6O5ELsnnRLQwUUQZ/
ONEuQ+HRB6RfES/
LS1EQEpbRL9fZkQa6v1EwYDYRLR1i0TBgo9EP1V2RMGSVUQWcstEwZcIRFrTkETBo6tESMiwRMLSAUQUyL9Ew1LsRC2+xUTGMwFENI/
2</peaks>
</scan>
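
For readers following along, the payload and its attributes can be read from the scan element with the standard library before any decoding happens. This is a minimal sketch, not part of the original post. The shortened fragment, the helper name `local_name`, and the one-pair payload are illustrative assumptions. Real mzXML files usually declare an XML namespace, which is why the tag is compared by its local name.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the scan above: same attributes on <peaks>,
# but the payload holds only the first m/z-intensity pair.
fragment = """<scan num="1" msLevel="1" peaksCount="1">
  <peaks precision="32" byteOrder="network" pairOrder="m/z-int">Q5aQG0R3K/g=</peaks>
</scan>"""


def local_name(tag):
    """Drop a '{namespace}' prefix, if any, from an element tag."""
    return tag.rsplit("}", 1)[-1]


root = ET.fromstring(fragment)
# Walk the tree and pick the first element whose local name is 'peaks'.
peaks = next(e for e in root.iter() if local_name(e.tag) == "peaks")

# The base64 text may be wrapped over several lines; remove all whitespace.
payload = "".join(peaks.text.split())
precision = int(peaks.get("precision"))   # bits per value
byte_order = peaks.get("byteOrder")       # "network" means big-endian
print(payload, precision, byte_order)
```

The three values read here (payload, precision, byte order) are what the decoding step needs.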

2) My Python code:

>>> import base64
>>> encoded = r'Q5aQG0R3K/hDlpE6RVIn/0OW12JED0lzQ5cRrkSObAdDlxJYRAD3mUOYHvJFjM65Q5h9ckP9BiRDmJ5vRAWJNUOad1lEAZ/zQ5qZO0UwRjpDnJCDRFeg5kOckZBGTL+iQ5ym70Q24qVDnRHsRIfZK0OdW81EEGlxQ52SsEQKmUhDnZaERCM8pkOdn4pFFBTVQ56MykRRt11DnqOvRA1Hn0Ohq+dEEbP8Q6M3H0P1a0FDo4AuRQzUwEOjiVlD6Af/Q6R/TUQaQYpDpH+8RC9rs0OkkUhD6htlQ6Yi4kVmDXBDplc0RAk09UOnDO9EKudJQ6ePSkTx3JJDqD3jRBxMREOp7ndEICmbQ67tVEQRNQtDrxCKRDOYqkOvl2lD8S+1Q7GHkkUHJqdDsgg/RDNPEEOyhvlEHCE0Q7KH7kQ2cyhDsxjpRBjaCEOzmTBEFLFPQ7b9m0QG11dDt6geRCMwj0O5icBEqPh1Q7mKQkSdO29DuYwNR7j5e0O5p1ZExr2GQ7oEfkQr6cJDugqyREbEjkO6C+xGCDvKQ7oNLERuBitDuiCoRArm5UO6i5VFsLhTQ7sL4ESCoJxDu4wDQ/k38UPAjnlEEdg8Q8CSYkQngqxDwYDKRC1E6EPBghJEFRPIQ8IO/0R7iUpDwo07RJ7NPkPDodhEa7OdQ8OjZEarYUFDxCOwRTnj6UPHJnBD99gVQ8igCUQIpltDyNh6Q/WvhEPJFiZEVbLkQ818bkQp/rpDzqCVRJmoeUPRpxZFTr6zQ9In20SI6OZD06s+Q/59v0PWiaFEK/M+Q9ipnUQbVnFD2iqYRIJn10PboVFD/G0zQ96Lg0TjBqdD3o45SAREokPekThEi4O+Q98OJ0bCyeBD343CRi9MXkPfj8VEayliQ9+r+UQQmx1D4A0ORI1kBkPgDe1EgteIQ+AoJ0QWENhD4I1SRISPa0PhDs1EDqGHQ+OpVEVQJ9pD5ZSaRN9wgEPnDm1ETrmMQ+cRiUeb0fJD55FoRkwj0UPnm55EDKY8Q+gOeEQhKvpD6BDZRa9GRkPokJpEFJXhQ+mLTkTCnc5D6gwGRGYH2kPqIidEDh1EQ+ojFkQFGoND7hfXRLAeOUP1FRFFTdvNQ/UdFkQlMZhD9ZSuRD/xGUP1t3pEJrAIQ/YU+USoDnND9vo6RCZ9+0P5jXVEDooyQ/rzNkQ5CJxD+vTKRCgsbEP695JENy3nQ/r4p0RMsoVD/JMHRAaBiEQAe2NEFPpxRAG4SUQ2hcVEAcgxRwXKykQCB/dFn8ZRRAIIz0RN8GlEAhAPRBVmZ0QCR+JFQt+QRAKHwURJXYFEBcukRFFC3EQGCCpEUakbRAYJ4UcDlBZEBkj1RJt8XUQGSatFgpc9RAaJSkTFpcpEBopJRCl8QUQGyWFETTiyRAmM8UT0Py9ECfUfRApsBEQNC7ZFH1D1RA1Lg0SHua1EEIfARC1eL0QR7pJEGtcbRBNUW0P0ubBEE7UNRBPLvUQUSThF06i2RBSJQUTSiNlEFMidRGQP2UQVCQZEMKc8RBVJHUQMdfFEF4UnREgAy0QXjDpEbo1GRBeQckR5w7REF5I0RCK7ZUQXlLFEEgxLRBfMf0SYuXpEGAxoREr6C0QYit5FsgouRBjK40TgEslEGQrhRL5UnEQbI4BECxaVRBt7+EQJa4NEHA4rRIzTh0QcNodEMpRBRBxOQUQySJ5EHTQFRBLSi0QehWNEGFcORB+M3ER1YEVEH8yiRFrOtEQgDQVEGUFURCE62UQHfatEIgjbRE0UdEQiCbREX5XzRCVoaEQOsGREJspVROOUakQnCb9EaEmsRCdKIkQsA/JEKfcBRCi1XUQqQI9EIHOlRCsLx0SQd+dEK0v9RIteTUQry0FEJINKRCxOv0QUP5hELfl5REWyGkQyThRD/IoPRDbr80RoxRJEOAQURCqeukQ5S0JEPpwtRDnLSkRN3V1EPZg+RFXVhkRBRa9EFmcZ
REGR1kQUV5FEQmAyRB4vnkRDOKlENxxrREn320QiCwRETI3QRB4QxERMzg9EDBw2REzQQ0Q1MilETNq'RETAMkRM44lENEIHREz6vkQOT3ZETxTTRDg+YERQH8dEnPPQRFAg3ERB4fpEVdLBRBe/+ERW7+pEPyw'RFdWfEQTt1REWekoRCIcpERd48lEMuvvRF4COkQNyjdEXtpPRBxQm0Rhe61EL7NLRGG3/kQVPuZEYre'RDj/HkRkkNdEdTd6RGXzJkQEWB5EZwacRB7wG0RoQyhEC/+XRGus9kQjkqhEa7nCRDkXqkRu6xVEsQt'RG8yjUQkSEtEc9wuRByVzUR0bldEIy6TRHXI9EQIKrtEdxxgRBc5lER4z2VEHJE0RHx3aEQrvN5EfJj'RBPZ50SBYCREM302RIF0hEQaQd5EgZETRCaBAUSDgK9EIPZ2RIPGTUQiO4VEg9xBRCCWy0SEmYpEIv/'RIa3EEQd48VEh3mXRCL4GkSKi0xEeVHCRIqMOURNj1JEjJd9RBmPFkSNs9FEDtULRI3rD0Q1EMhElB+'RDaiFUScBBpEK2ozRJyp+URVoDZEnTXXRBaChESdqXVEPYuXRJ3veERbiFVEnk1SRCsuLESeZ2dEJ5+'RJ53ZkQGzaJEopi6RNqrk0SimelEnqz+RKPqCUQZlN9EpeczRDcfWkSnVnREH1h/RKeOg0Qcs2lEqH2'RBx5JESrBbhEFpe5RK3XBkQMUlNErrpnRB5XgESx6O5ELsnnRLQwUUQZ/ONEuQ+HRB6RfES/LS1EQEp'RL9fZkQa6v1EwYDYRLR1i0TBgo9EP1V2RMGSVUQWcstEwZcIRFrTkETBo6tESMiwRMLSAUQUyL9Ew1L'RC2+xUTGMwFENI/2' s
>>> data = base64.b64decode(encoded)
>>> data
'C\x96\x90\x1bDw+\xf8C\x96\x91:ER\'\xffC\x96\xd7bD\x0fIsC\x97\x11\xaeD
\x8el\x07C\x97\x12XD\x00\xf7\x99C\x98\x1e\xf2E\x8c\xce\xb9C\x98}rC\xfd
\x06$C\x98\x9eoD\x05\x895C\x9awYD\x01\x9f\xf3C\x9a\x99;E0F:C\x9c
\x90\x83DW\xa0\xe6C\x9c\x91\x90FL\xbf\xa2C\x9c\xa6\xefD6\xe2\xa5C\x9d
\x11\xecD\x87\xd9+C\x9d[\xcdD\x10iqC\x9d\x92\xb0D\n\x99HC\x9d
\x96\x84D#<\xa6C\x9d\x9f\x8aE\x14\x14\xd5C\x9e\x8c\xcaDQ\xb7]C\x9e
\xa3\xafD\rG\x9fC\xa1\xab\xe7D\x11\xb3\xfcC\xa37\x1fC\xf5kAC\xa3\x80.E
\x0c\xd4\xc0C\xa3\x89YC\xe8\x07\xffC\xa4\x7fMD\x1aA\x8aC\xa4\x7f\xbcD/k
\xb3C\xa4\x91HC\xea\x1beC\xa6"\xe2Ef\rpC\xa6W4D\t4\xf5C\xa7\x0c\xefD*
\xe7IC\xa7\x8fJD\xf1\xdc\x92C\xa8=\xe3D\x1cLDC\xa9\xeewD )\x9bC\xae
\xedTD\x115\x0bC\xaf\x10\x8aD3\x98\xaaC\xaf\x97iC\xf1/\xb5C
\xb1\x87\x92E\x07&\xa7C\xb2\x08?D3O\x10C\xb2\x86\xf9D\x1c!4C
\xb2\x87\xeeD6s(C\xb3\x18\xe9D\x18\xda\x08C\xb3\x990D\x14\xb1OC\xb6\xfd
\x9bD\x06\xd7WC\xb7\xa8\x1eD#0\x8fC\xb9\x89\xc0D\xa8\xf8uC\xb9\x8aBD
\x9d;oC\xb9\x8c\rG\xb8\xf9{C\xb9\xa7VD\xc6\xbd\x86C\xba\x04~D+\xe9\xc2C
\xba\n\xb2DF\xc4\x8eC\xba\x0b\xecF\x08;\xcaC\xba\r,Dn\x06+C\xba \xa8D\n
\xe6\xe5C\xba\x8b\x95E\xb0\xb8SC\xbb\x0b\xe0D\x82\xa0\x9cC\xbb\x8c\x03C
\xf97\xf1C\xc0\x8eyD\x11\xd8<C\xc0\x92bD\'\x82\xacC\xc1\x80\xcaD-D\xe8C
\xc1\x82\x12D\x15\x13\xc8C\xc2\x0e\xffD{\x89JC\xc2\x8d;D\x9e\xcd>C
\xc3\xa1\xd8Dk\xb3\x9dC\xc3\xa3dF\xabaAC\xc4#\xb0E9\xe3\xe9C\xc7&pC
\xf7\xd8\x15C\xc8\xa0\tD\x08\xa6[C\xc8\xd8zC\xf5\xaf\x84C\xc9\x16&DU
\xb2\xe4C\xcd|nD)\xfe\xbaC\xce\xa0\x95D\x99\xa8yC\xd1\xa7\x16EN\xbe
\xb3C\xd2\'\xdbD\x88\xe8\xe6C\xd3\xab>C\xfe}\xbfC\xd6\x89\xa1D+\xf3>C
\xd8\xa9\x9dD\x1bVqC\xda*\x98D\x82g\xd7C\xdb\xa1QC\xfcm3C\xde\x8b\x83D
\xe3\x06\xa7C\xde\x8e9H\x04D\xa2C\xde\x918D\x8b\x83\xbeC\xdf\x0e\'F
\xc2\xc9\xe0C\xdf\x8d\xc2F/L^C\xdf\x8f\xc5Dk)bC\xdf\xab\xf9D\x10\x9b
\x1dC\xe0\r\x0eD\x8dd\x06C\xe0\r\xedD\x82\xd7\x88C\xe0(\'D\x16\x10\xd8C
\xe0\x8dRD\x84\x8fkC\xe1\x0e\xcdD\x0e\xa1\x87C\xe3\xa9TEP\'\xdaC
\xe5\x94\x9aD\xdfp\x80C\xe7\x0emDN\xb9\x8cC\xe7\x11\x89G\x9b\xd1\xf2C
\xe7\x91hFL#\xd1C\xe7\x9b\x9eD\x0c\xa6<C\xe8\x0exD!*\xfaC\xe8\x10\xd9E
\xafFFC\xe8\x90\x9aD\x14\x95\xe1C\xe9\x8bND\xc2\x9d\xceC\xea\x0c\x06Df
\x07\xdaC\xea"\'D\x0e\x1dDC\xea#\x16D\x05\x1a\x83C\xee\x17\xd7D
\xb0\x1e9C\xf5\x15\x11EM\xdb\xcdC\xf5\x1d\x16D%1\x98C\xf5\x94\xaeD?
\xf1\x19C\xf5\xb7zD&\xb0\x08C\xf6\x14\xf9D\xa8\x0esC\xf6\xfa:D&}\xfbC
\xf9\x8duD\x0e\x8a2C\xfa\xf36D9\x08\x9cC\xfa\xf4\xcaD(,lC\xfa
\xf7\x92D7-\xe7C\xfa\xf8\xa7DL\xb2\x85C\xfc\x93\x07D\x06\x81\x88D\x00
{cD\x14\xfaqD\x01\xb8ID6\x85\xc5D\x01\xc81G\x05\xca\xcaD\x02\x07\xf7E
\x9f\xc6QD\x02\x08\xcfDM\xf0iD\x02\x10\x0fD\x15fgD\x02G\xe2EB\xdf\x90D
\x02\x87\xc1DI]\x81D\x05\xcb\xa4DQB\xdcD\x06\x08*DQ\xa9\x1bD\x06\t\xe1G
\x03\x94\x16D\x06H\xf5D\x9b|]D\x06I\xabE\x82\x97=D\x06\x89JD
\xc5\xa5\xcaD\x06\x8aID)|AD\x06\xc9aDM8\xb2D\t\x8c\xf1D\xf4?/D\t
\xf5\x1fD\nl\x04D\r\x0b\xb6E\x1fP\xf5D\rK\x83D\x87\xb9\xadD
\x10\x87\xc0D-^/D\x11\xee\x92D\x1a\xd7\x1bD\x13T[C\xf4\xb9\xb0D
\x13\xb5\rD\x13\xcb\xbdD\x14I8E\xd3\xa8\xb6D\x14\x89AD\xd2\x88\xd9D
\x14\xc8\x9dDd\x0f\xd9D\x15\t\x06D0\xa7<D\x15I\x1dD\x0cu\xf1D
\x17\x85\'DH\x00\xcbD\x17\x8c:Dn\x8dFD\x17\x90rDy\xc3\xb4D
\x17\x924D"\xbbeD\x17\x94\xb1D\x12\x0cKD\x17\xcc\x7fD\x98\xb9zD
\x18\x0chDJ\xfa\x0bD\x18\x8a\xdeE\xb2\n.D\x18\xca\xe3D\xe0\x12\xc9D
\x19\n\xe1D\xbeT\x9cD\x1b#\x80D\x0b\x16\x95D\x1b{\xf8D\tk\x83D\x1c\x0e
+D\x8c\xd3\x87D\x1c6\x87D2\x94AD\x1cNAD2H\x9eD\x1d4\x05D\x12\xd2\x8bD
\x1e\x85cD\x18W\x0eD\x1f\x8c\xdcDu`ED\x1f\xcc\xa2DZ\xce\xb4D \r\x05D
\x19ATD!:\xd9D\x07}\xabD"\x08\xdbDM\x14tD"\t\xb4D_\x95\xf3D%hhD\x0e
\xb0dD&\xcaUD\xe3\x94jD\'\t\xbfDhI\xacD\'J"D,\x03\xf2D)\xf7\x01D(\xb5]
D*@\x8fD s\xa5D+\x0b\xc7D\x90w\xe7D+K\xfdD\x8b^MD+\xcbAD$\x83JD,N\xbfD
\x14?\x98D-\xf9yDE\xb2\x1aD2N\x14C\xfc\x8a\x0fD6\xeb\xf3Dh
\xc5\x12D8\x04\x14D*\x9e\xbaD9KBD>\x9c-D9\xcbJDM\xdd]D=\x98>DU
\xd5\x86DAE\xafD\x16g\x19DA\x91\xd6D\x14W\x91DB`2D\x1e/
\x9eDC8\xa9D7\x1ckDI\xf7\xdbD"\x0b\x04DL\x8d\xd0D\x1e\x10\xc4DL\xce
\x0fD\x0c\x1c6DL\xd0CD52)DL\xda\xbeDD\xc02DL\xe3\x89D4B\x07DL\xfa\xbeD
\x0eOvDO\x14\xd3D8>`DP\x1f\xc7D\x9c\xf3\xd0DP \xdcDA\xe1\xfaDU\xd2\xc1D
\x17\xbf\xf8DV\xef\xeaD?,+DWV|D\x13\xb7TDY\xe9(D"\x1c\xa4D]
\xe3\xc9D2\xeb\xefD^\x02:D\r\xca7D^\xdaOD\x1cP\x9bDa{\xadD/\xb3KDa
\xb7\xfeD\x15>\xe6Db\xb7\x8fD8\xff\x1eDd\x90\xd7Du7zDe\xf3&D\x04X\x1eDg
\x06\x9cD\x1e\xf0\x1bDhC(D\x0b\xff\x97Dk\xac\xf6D#\x92\xa8Dk
\xb9\xc2D9\x17\xaaDn\xeb\x15D\xb1\x0bODo2\x8dD$HKDs\xdc.D\x1c
\x95\xcdDtnWD#.\x93Du\xc8\xf4D\x08*\xbbDw\x1c`D\x179\x94Dx\xcfeD\x1c
\x914D|whD+\xbc\xdeD|\x98\xe5D\x13\xd9\xe7D\x81`$D3}6D\x81t\x84D\x1aA
\xdeD\x81\x91\x13D&\x81\x01D\x83\x80\xafD \xf6vD\x83\xc6MD";\x85D
\x83\xdcAD \x96\xcbD\x84\x99\x8aD"\xff\xddD\x86\xb7\x10D\x1d\xe3\xc5D
\x87y\x97D"\xf8\x1aD\x8a\x8bLDyQ\xc2D\x8a\x8c9DM\x8fRD\x8c\x97}D
\x19\x8f\x16D\x8d\xb3\xd1D\x0e\xd5\x0bD\x8d\xeb\x0fD5\x10\xc8D\x94\x1f
\x9bD6\xa2\x15D\x9c\x04\x1aD+j3D\x9c\xa9\xf9DU\xa06D\x9d5\xd7D
\x16\x82\x84D\x9d\xa9uD=\x8b\x97D\x9d\xefxD[\x88UD\x9eMRD+.,D\x9eggD
\'\x9f\xbeD\x9ewfD\x06\xcd\xa2D\xa2\x98\xbaD\xda\xab\x93D\xa2\x99\xe9D
\x9e\xac\xfeD\xa3\xea\tD\x19\x94\xdfD\xa5\xe73D7\x1fZD\xa7VtD\x1fX\x7fD
\xa7\x8e\x83D\x1c\xb3iD\xa8}\xa5D\x1cy$D\xab\x05\xb8D\x16\x97\xb9D\xad
\xd7\x06D\x0cRSD\xae\xbagD\x1eW\x80D\xb1\xe8\xeeD.\xc9\xe7D\xb40QD
\x19\xfc\xe3D\xb9\x0f\x87D\x1e\x91|D\xbf--D@J[D\xbf_fD\x1a\xea\xfdD
\xc1\x80\xd8D\xb4u\x8bD\xc1\x82\x8fD?UvD\xc1\x92UD\x16r\xcbD
\xc1\x97\x08DZ\xd3\x90D\xc1\xa3\xabDH\xc8\xb0D\xc2\xd2\x01D
\x14\xc8\xbfD\xc3R\xecD-\xbe\xc5D\xc63\x01D4\x8f\xf6'


Jimmy Eng said (in an earlier post) that we have to convert this to
little-endian and then start reading. This is where I am stuck.
Some of the things I tried:

>>> from struct import *
>>> unpack('!>',data )
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
struct.error: bad char in struct format
>>> unpack('!>L',data )
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
struct.error: bad char in struct format
>>> unpack('!L',data )
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
struct.error: unpack requires a string argument of length 4
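
The three attempts above fail for reasons that the struct documentation covers. `!` and `>` are both byte-order prefixes, and a format string takes at most one prefix followed by type codes, so `'!>'` and `'!>L'` are rejected. `'!L'` is valid but describes exactly one 4-byte value, while `data` holds many. Since the scan declares `byteOrder="network"` (big-endian) and `precision="32"`, struct can handle the byte order itself with a repeat count. The following is a minimal sketch in Python 3 under those assumptions. To stay short it uses only the first m/z-intensity pair of the payload, re-padded so that it decodes on its own.

```python
import base64
import struct

# First 8 bytes of the <peaks> payload (one m/z-intensity pair),
# re-encoded with padding so the example is self-contained.
encoded = "Q5aQG0R3K/g="
data = base64.b64decode(encoded)

# One type code describes one value, so the repeat count has to be
# derived from the buffer length: 4 bytes per value at precision="32".
n_values = len(data) // 4

# '>' (equivalently '!') selects big-endian / network order;
# 'f' is a 32-bit IEEE 754 float. Only one prefix character is allowed.
values = struct.unpack(">%df" % n_values, data)
mz, intensity = values
print(mz, intensity)
```

The decoded m/z comes out close to the `lowMz="301.126"` attribute of the scan, which is a quick sanity check that the byte order and type code are right.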


Here is the documentation link for the struct module in Python:
http://docs.python.org/library/struct.html

Thanks in advance.

Regards,
Datta
Grad Student, Bioinformatics, UMICH.

Taejoon Kwon

Jan 23, 2010, 11:23:26 AM
to spctools-discuss
Hi Datta,

I made a Python module to parse mzXML files, especially peak lists.
It may be helpful for your work.
http://code.google.com/p/massspec-toolbox/source/browse/#svn/trunk/mzxml

Here is example code that uses this module to parse MS1 intensities.
http://code.google.com/p/massspec-toolbox/source/browse/trunk/mzxml/mzxml2ms1.py

Just the decoding function (in MzXML.py):
----
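import base64
import struct

def decode_spectrum(self, line):
    # The <peaks> text is base64; decode it to raw bytes first.
    decoded = base64.b64decode(line)
    # precision="32" means 4 bytes per value.
    tmp_size = len(decoded) // 4
    # '>' = big-endian (network order), 'L' = unsigned 32-bit integer.
    unpack_format1 = ">%dL" % tmp_size

    idx = 0
    mz_list = []
    intensity_list = []

    for tmp in struct.unpack(unpack_format1, decoded):
        # Reinterpret the integer's 32 bits as an IEEE 754 float.
        tmp_i = struct.pack("I", tmp)
        tmp_f = struct.unpack("f", tmp_i)[0]
        # Values alternate: m/z, intensity, m/z, intensity, ...
        if idx % 2 == 0:
            mz_list.append(float(tmp_f))
        else:
            intensity_list.append(float(tmp_f))
        idx += 1
    # The posted excerpt stops here; returning both lists is an assumption.
    return mz_list, intensity_list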
----
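
The decoding function above unpacks unsigned integers and then reinterprets each one as a float. The same result can probably be reached in one step by unpacking floats directly. The following is a hedged Python 3 sketch, not code from the massspec-toolbox module. The function name `decode_peaks` and its `precision` parameter are hypothetical, and it assumes `byteOrder="network"` and `pairOrder="m/z-int"` as in the scan posted earlier.

```python
import base64
import struct


def decode_peaks(b64_text, precision=32):
    """Decode an mzXML <peaks> payload into a list of (m/z, intensity) tuples.

    Assumes big-endian (network order) values stored as m/z-intensity pairs.
    """
    # Payloads copied from mail or XML are often line-wrapped; strip whitespace.
    raw = base64.b64decode("".join(b64_text.split()))
    # 'f' = 32-bit float, 'd' = 64-bit float.
    type_code = {32: "f", 64: "d"}[precision]
    size = precision // 8
    count = len(raw) // size
    # '>' selects big-endian; the repeat count covers the whole buffer.
    values = struct.unpack(">%d%s" % (count, type_code), raw[:count * size])
    # Even positions are m/z values, odd positions are intensities.
    return list(zip(values[0::2], values[1::2]))


# Round trip on made-up values that are exactly representable as 32-bit floats.
sample = base64.b64encode(struct.pack(">4f", 301.125, 988.5, 302.25, 1000.0))
pairs = decode_peaks(sample.decode("ascii"))
print(pairs)
```

Returning pairs rather than two parallel lists is only a design preference; `zip(*pairs)` recovers separate m/z and intensity sequences if those are more convenient.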

Hope this helps.

Taejoon


datta

Jan 25, 2010, 3:24:23 PM
to spctools-discuss
Taejoon,

Thanks a lot! This is indeed very helpful.
Thank you!

Datta.


Brian Pratt

Jan 25, 2010, 3:35:09 PM
to spctools...@googlegroups.com
I wonder if anyone has done this for MATLAB? (I know the Bioinformatics Toolbox has an mzXML reader, but that's pretty spendy for some folks.)
 
For that matter, I wonder what the entire list of languages would look like?  At this point I'm aware of C/C++, Java, and now Python.
- Brian

--
You received this message because you are subscribed to the Google Groups "spctools-discuss" group.
To post to this group, send email to spctools...@googlegroups.com.
To unsubscribe from this group, send email to spctools-discu...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/spctools-discuss?hl=en.


Darryl Davis

Jan 27, 2010, 8:27:24 AM
to spctools...@googlegroups.com

I have used the MATLAB Bioinformatics Toolbox, and someone did put out a dist. platform that doesn't require the toolbox; I will send it along when I find it.

http://www.ms-utils.org/wiki/pmwiki.php/Main/SoftwareList

Also, for Python, mMass (http://mmass.biographics.cz/) has some pretty good tools.

Moshe Olshansky

Feb 1, 2010, 8:26:26 PM
to spctools...@googlegroups.com
I have Perl code that does this.

Regards,
Moshe.



--
Moshe Olshansky
Metabolomics Australia
Bio21 Molecular Science and Biotechnology Institute
The University of Melbourne
30 Flemington Road, Parkville, VIC 3010
Australia
voice: +61 3 8344 2201
e-mail: mos...@unimelb.edu.au

http://www.metabolomics.net.au

Matthew Chambers

Feb 2, 2010, 10:18:57 AM
to spctools...@googlegroups.com
Pwiz has C++/CLI bindings, which brings in the .NET languages (C++/CLI,
C#, VB.NET).

-Matt
