Convert corrupt Thermo RAW file


dbl...@gmail.com

Oct 12, 2016, 8:22:52 AM
to ProteomicsQA

Dear Proteomics users,


I would like to convert a Thermo RAW file that is unfortunately corrupt. A power surge shut down the computer, but the LC and the MS kept running. What I have is a 320 MB file that neither Xcalibur nor MSFileReader can open. Unfortunately, this is the most important sample I have ever dealt with, since it is a unique and irreplaceable ancient sample.
I've tried to convert from Thermo RAW to mzML using both the GUI and the CLI of msconvert, but I get the same error message: "unable to open file". Is there ANY way I can read and convert at least some of this data?
Maybe these raw files have some kind of end tag that I can add to the corrupted file to make it readable? I'm thinking in terms of
Please let me know if you have any suggestions or tips!
Thank you very much for your time and support!


Best regards,
David

Vladimir Gorshkov

Oct 12, 2016, 10:36:48 AM
to ProteomicsQA
Dear David,

I am afraid there is no tool designed specifically to "cure" corrupted Thermo RAW files. The file can probably be accessed programmatically, scan by scan; however, since access is provided through libraries supplied by Thermo, there is little hope.
You can try unfinnigan (https://code.google.com/archive/p/unfinnigan/), a reader for RAW files that doesn't use the Thermo libraries. However, its development stopped a long time ago, so it can only open fairly old files.
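Before attempting any scan-by-scan parsing, a quick sanity check is to see whether the beginning of the corrupt file is still intact. The sketch below is only an illustration, not a recovery tool. It assumes the layout described in the unfinnigan project's reverse-engineered notes (a 2-byte magic number 0xA101 followed by the string "Finnigan" in UTF-16LE); this is not an official Thermo specification, and the function names are made up for this example.

```python
import struct

# Assumed layout (from unfinnigan's reverse-engineered notes, not from Thermo):
# a 2-byte little-endian magic number, then "Finnigan" encoded as UTF-16LE.
ASSUMED_MAGIC = 0xA101
ASSUMED_SIGNATURE = "Finnigan"


def looks_like_thermo_raw(header: bytes) -> bool:
    """Return True if the first bytes match the assumed RAW file signature."""
    sig_len = 2 * len(ASSUMED_SIGNATURE)  # UTF-16LE uses 2 bytes per character here
    if len(header) < 2 + sig_len:
        return False
    (magic,) = struct.unpack_from("<H", header, 0)
    signature = header[2:2 + sig_len].decode("utf-16-le", errors="replace")
    return magic == ASSUMED_MAGIC and signature == ASSUMED_SIGNATURE


def check_file(path: str) -> bool:
    """Read the first 64 bytes of a file and test them against the signature."""
    with open(path, "rb") as fh:
        return looks_like_thermo_raw(fh.read(64))
```

If the check passes, that would suggest the start of the file survived the crash and that the damage is more likely at the end, where the file was never finalized.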

Best regards,
Vladimir

David Bouyssié

Oct 12, 2016, 5:14:30 PM
to ProteomicsQA
I like challenges :D

But I think Vladimir is right, it is not very easy to deal with corrupted raw files.

I have never used unfinnigan, but I did similar work 10 years ago during my PhD to develop a raw data processing tool (https://www.ncbi.nlm.nih.gov/pubmed/17533220).
I could try to use my own Perl scripts, or possibly unfinnigan, to check whether a partial recovery is possible.

Best,

David


David Bouyssié

Oct 19, 2016, 3:44:26 AM
to ProteomicsQA
To summarize the status of the RAW file data recovery:
We were able to recover the MS and MS/MS spectra. However, some metadata are lost because they are usually stored at the end of the RAW file.
These metadata include the retention time of the spectra and, in the case of MS/MS spectra, the m/z value of the precursor ion.

So the next step is to find a method to infer the precursor ion m/z value.
I guess that a small residual portion of the precursor ion may be found inside the MS/MS spectra.
But this is another topic!

jokirkp...@googlemail.com

Oct 19, 2016, 5:16:34 AM
to ProteomicsQA
Hi David,

I don't know which instrument your data come from, but there used to be a trimming tool in Xcalibur (at least in version 2.1) with which I managed, once or twice in the past, to recover unfinished raw data files (it doesn't always work, sadly). It was found under FT Programs (from the Start menu) and is called the Slicer tool. If I recall correctly, one could pick a retention time corresponding to the number of minutes the run had lasted, just before the instrument shut down, and slice the raw file into a new raw file with a proper 'end' at that time, which should then be openable. Worth a shot? I no longer have the software (as I am running a Lumos), but anyone running an LTQ Velos might still have it, or Thermo might help you?

Cheers
Joanna

Marc Vaudel

Oct 19, 2016, 5:37:36 AM
to ProteomicsQA
Very impressive!

Maybe Bullseye (https://proteome.gs.washington.edu/software/bullseye/index.html) would do the trick? Otherwise, run a search with an extra-wide MS1 tolerance and then filter the PSMs based on the MS1 spectrum? As you say, this is another question :)

Marc

David Lyon

Oct 19, 2016, 6:03:14 AM
to ProteomicsQA
Dear all,

Yes, very impressive work from David indeed. Thanks again!

Also many thanks to Vladimir for his programming tips!

Thank you Joanna for the tip! The instrument is a Q Exactive. I couldn't find the Slicer tool (Xcalibur v 3.0.63), but I'll keep looking and will get back to you if I'm successful.

Thank you Marc for the Bullseye tip! Without precursor info I think the search space would be too large (due to unspecific cleavage, degraded proteins --> many PTMs, etc.). But maybe Bullseye can resolve this. 

It seems we might have another shot at measuring this precious sample after all (the flow-through from the STAGE tips was kept, as well as a small aliquot of the original sample, which usually isn't the case with these samples).

This is a great community. I'm very grateful for all your support!! 

Best regards,
David







Vladimir Gorshkov

Oct 21, 2016, 2:30:31 PM
to ProteomicsQA, dbl...@gmail.com
Hi David,

Is the precursor ion information lost completely, or only the mass (i.e. the isolation window is still there)? In the latter case, you can try an approach similar to DeMix (https://www.ncbi.nlm.nih.gov/pubmed/25100859): make a list of all peptide-like features in MS1 and use that list to search the corresponding MS/MS spectra.
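As a rough illustration of that idea (not the DeMix implementation itself), the sketch below pairs each MS/MS spectrum with the MS1 features that fall inside its isolation window. It assumes the features have already been detected by some other tool, and it uses scan numbers instead of retention times, since the retention times were lost in the recovery. The `Feature` fields, the function name and the 2 m/z window width are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """A peptide-like MS1 feature (hypothetical structure for this sketch)."""
    mz: float
    charge: int
    first_scan: int  # first scan in which the feature is observed
    last_scan: int   # last scan in which the feature is observed


def candidate_precursors(features, msms_scan, window_center, window_width=2.0):
    """Features eluting at msms_scan whose m/z lies inside the isolation window."""
    half = window_width / 2
    return [f for f in features
            if f.first_scan <= msms_scan <= f.last_scan
            and abs(f.mz - window_center) <= half]
```

Each candidate then supplies a precursor m/z and charge for the MS/MS spectrum, so one spectrum may be searched several times if more than one feature was co-isolated.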

Best regards,
Vladimir

Jarka Stankova

Nov 17, 2021, 7:42:15 AM
to ProteomicsQA
Dear David, 

Have you been successful with the data recovery? I accidentally deleted my Thermo raw data. I managed to recover the files, but I am not able to open them. They seem to be corrupt.

Many thanks, 
Jarka
