Same lnNumDSP values for all targets and all decoys after tide-search

Abhishek Dubey

Apr 2, 2019, 10:05:46 AM
to crux-users
I was using crux to analyze MALDI MS/MS data. Strangely, the tide-search output has one single lnNumDSP value for all targets and another single value for all decoys.

Because of this, I am not able to process the results further with percolator.

Kindly suggest what the issue is and how it can be tackled.

William S Noble

Apr 2, 2019, 11:46:14 AM
to Abhishek Dubey, crux-users
Can you send details on how you generated your index and ran your search? It would be great if you could send command lines, log files and sample input files so we can reproduce this behavior locally.

Thanks.
Bill


--
You received this message because you are subscribed to the Google Groups "crux-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to crux-users+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Abhishek Dubey

Apr 6, 2019, 6:54:07 AM
to crux-users
The following is the link to the Google Drive folder where the mzML file, the database FASTA file, the crux-output folder, etc. have been uploaded.





William S Noble

Apr 12, 2019, 1:23:10 PM
to Abhishek Dubey, crux-users
Thanks for sending the input files. I have put this on our issue tracker:


We will send an update once we reproduce your result.

Bill



Abhishek Dubey

Feb 20, 2020, 4:26:53 AM
to crux-users
Any update?

Charles E Grant

Feb 26, 2020, 4:49:22 PM
to crux-users
Hi Abhishek,

Our apologies for letting this get lost in our queue! 

Using the files you provided, I was able to reproduce the problem of seeing the same value of lnNumDSP for all matches. However, when I examined the spectrum file you provided (651.mzML), I noticed that all the spectra had a precursor m/z of 651.0. Given the name of the file, I assume this was deliberate. You also specified a precursor window of 2.0. This means that for each spectrum, the only peptides considered as candidates are those in the mass range 651 +/- 2. Note that in the tide-search.target.txt file the values in the column 'distinct matches/spectrum' are all 77. This means that your protein database contains only 77 peptides in that mass range, and each spectrum gets compared to the same 77 peptides. lnNumDSP is just the natural log of the 'distinct matches/spectrum' value.

You could increase the number of candidate peptides by increasing the precursor window, but since each spectrum has the same precursor mass, you are always going to get the same number of candidate peptides for each spectrum.
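
To make the arithmetic concrete, here is a minimal sketch in Python. It is not Crux code; the helper name ln_num_dsp is made up for illustration, and the only number taken from this thread is the 77 in the 'distinct matches/spectrum' column.

```python
import math


def ln_num_dsp(distinct_matches: int) -> float:
    """Natural log of the number of distinct candidate peptides for one spectrum.

    Hypothetical helper for illustration only; not part of Crux.
    """
    return math.log(distinct_matches)


# Every row of tide-search.target.txt in this thread reported 77 distinct matches,
# so every PSM ends up with the same feature value.
per_spectrum_counts = [77, 77, 77, 77, 77]
values = {round(ln_num_dsp(n), 4) for n in per_spectrum_counts}
print(values)  # {4.3438}
```

Since the count is identical for every spectrum, the set of lnNumDSP values collapses to a single number.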

 

Charles E Grant

Mar 4, 2020, 6:04:56 PM
to Abhishek Dubey, crux-users
Hi Abhishek,

Be sure to send support questions about Crux to the Crux Users mailing list. That way they’ll be seen by multiple people. I’m just a software developer with a very limited knowledge of mass spec.

> On Feb 28, 2020, at 8:28 AM, Abhishek Dubey <abhishekdub...@gmail.com> wrote:
>
> I had one further question which is not directly related to crux. I do mass spec on a Waters instrument. The way MS/MS mode works is that one file records the fragmentation spectra of one peak (there are also multiple scans). When analysing multiple peaks, do we have to merge the fragmentation spectra from the different peaks and then perform the analysis? If yes, how do you do it? (I briefly looked at MaraCluster from Lukas Kall.)

I’m afraid no one in our group has much experience working with that instrument, but yes, to get an accurate estimation of statistical confidence using Crux, you would need to combine the spectra into a single file. There is no facility in Crux for doing this. You should double check the Waters documentation to see if they have software settings that would do this. You might also look at msconvert by the Proteowizard group:

http://proteowizard.sourceforge.net/tools/msconvert.html

It seems to have a ‘--merge’ option.
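
As a rough sketch of what such a merge call might look like, the Python snippet below only assembles and prints an msconvert command line. The input file names are placeholders, and the combination of flags (--merge together with --mzML, -o and --outfile) is an assumption that should be checked against the msconvert help output for your installed version. The hypothetical run_merge helper does nothing unless msconvert is found on the PATH.

```python
import shutil
import subprocess


def build_merge_command(inputs, outfile, outdir="."):
    """Assemble an msconvert call that merges several input files into one mzML file."""
    return ["msconvert", *inputs, "--merge", "--mzML", "-o", outdir, "--outfile", outfile]


def run_merge(inputs, outfile, outdir="."):
    """Run the merge only if msconvert is installed; return True if it was run."""
    if shutil.which("msconvert") is None:
        return False
    subprocess.run(build_merge_command(inputs, outfile, outdir), check=True)
    return True


# Placeholder file names, one file per fragmented peak.
cmd = build_merge_command(["peak1.mzML", "peak2.mzML"], "combined.mzML")
print(" ".join(cmd))
```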

You’d have to ask the Kall group about MaraCluster, but my initial impression is that it is not a file merging tool, but a tool for identifying spectra corresponding to the same underlying peptide.

Charles

Abhishek Dubey

Mar 9, 2020, 7:27:29 AM
to crux-users
I had another file with which I encountered the same problem. Even when I set the precursor window to 100, the lnNumDSP values were still the same for all targets, and the same for all decoys.

Please have a look at the files. I am attaching the database index, the .mzML file, and the text file containing the command I ran.

Abhishek Dubey

Mar 9, 2020, 9:27:04 AM
to crux-users
I am attaching four files: 1112.mzML, 1453.mzML, 1925.mzML, and combined.mzML

The combined file contains the spectra from all three files.

When I analyze the combined file, I do not get the same lnNumDSP for all targets and all decoys, BUT when I analyze the files individually with exactly the same settings, I get the same lnNumDSP for all targets and all decoys in 1453.mzML and 1925.mzML.

I find this strange.

Please find the files here: files

Best
Abhishek

Charles E Grant

Mar 10, 2020, 6:59:31 PM
to crux-users
Hi Abhishek, look at the response to your next question for an answer. (They're really the same question).


Charles E Grant

Mar 10, 2020, 7:12:29 PM
to crux-users
Hi Abhishek,


On Monday, March 9, 2020 at 6:27:04 AM UTC-7, Abhishek Dubey wrote:
> I am attaching four files: 1112.mzML, 1453.mzML, 1925.mzML, and combined.mzML

The files 1112.mzML, 1453.mzML, and 1925.mzML were attached, but there's no combined.mzML.

> The combined file contains the spectra from all three files.
>
> When I analyze the combined file, I do not get the same lnNumDSP for all targets and all decoys, BUT when I analyze the files individually with exactly the same settings, I get the same lnNumDSP for all targets and all decoys in 1453.mzML and 1925.mzML.
>
> I find this strange.

Actually, this is exactly what I would expect. lnNumDSP represents the natural log of the number of distinct peptides that were considered as candidates for the peptide-spectrum match. The candidate peptides are just those peptides derived from your protein database that are within the specified m/z window of the precursor m/z. The m/z window is a fixed value for any run of Crux. If all the input spectra have the same precursor m/z, then they'll all be compared with the same set of candidate peptides, and they'll all have the same value of lnNumDSP. Now, if you combine spectra with different precursor m/z into a single file, then each value of the precursor m/z will correspond to a different set of candidate peptides, and you'll have different values of lnNumDSP for the different precursor m/z. You should observe that peptide-spectrum matches with the same precursor m/z have the same lnNumDSP.
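
The effect can be illustrated with a toy sketch in Python. Everything in it is assumed for illustration: the peptide masses are invented, count_candidates is a hypothetical helper rather than anything in Crux, and the window is treated as a simple symmetric tolerance in the same units as the precursor value, which glosses over how tide-search actually handles charge states and window types.

```python
import math
from bisect import bisect_left, bisect_right


def count_candidates(sorted_masses, precursor, window):
    """Count database peptides whose mass lies within precursor +/- window."""
    lo = bisect_left(sorted_masses, precursor - window)
    hi = bisect_right(sorted_masses, precursor + window)
    return hi - lo


# Invented peptide masses standing in for a digested protein database.
peptide_masses = sorted(
    [1110.5, 1111.2, 1112.9, 1452.1, 1453.0, 1453.8, 1454.6, 1924.4, 1925.3]
)
window = 2.0  # fixed for the whole run


def ln_num_dsp_values(precursors):
    """One feature value per spectrum: ln(number of candidates in the window)."""
    return [math.log(count_candidates(peptide_masses, p, window)) for p in precursors]


single_file = [1453.0] * 4                   # every spectrum shares one precursor
combined = [1112.0, 1453.0, 1925.0, 1453.0]  # spectra pooled from three precursors

print(len(set(ln_num_dsp_values(single_file))))  # 1 distinct value
print(len(set(ln_num_dsp_values(combined))))     # 3 distinct values
```

In this toy data the two spectra with precursor 1453.0 in the pooled list still share one value, which matches the observation that PSMs with the same precursor m/z have the same lnNumDSP.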

Abhishek Dubey

Mar 11, 2020, 1:35:16 AM
to crux-users
Thank you for your reply. I now understand lnNumDSP better. My last question: because it takes one single value for all targets and another single value for all decoys, I was not able to process the results in Percolator, which uses it as one of the features to rank PSMs.

It seems to me, then, that lnNumDSP can be removed for processing with Percolator. Would that be a good choice?

From the pooled file of the three precursor ions, I was able to get some hits. So, I will stick to that for analysis. 

Best
Abhishek