DIA-NN compatibility


Klemens Fröhlich

Aug 9, 2020, 9:47:06 AM
to MSstats
Dear MSstats team,

I am currently testing DIA-NN, the software published last year, which gives me the following output (see below or the attached TSV).

The fragment ion quantitation seems to be reported separately, but without any information about which ions these are, so I cannot produce a "clean" 10-column format for analysis. Or I might be missing something, of course!

Can I still analyse this data somehow with MSstats?

Best, Klemens

File.Name Protein.Group Protein.Ids Protein.Names Genes PG.Quantity PG.Normalised Genes.Quantity Genes.Normalised Genes.MaxLFQ Genes.MaxLFQ.Unique Modified.Sequence Stripped.Sequence Precursor.Id Precursor.Charge Q.Value Protein.Q.Value PG.Q.Value GG.Q.Value Proteotypic Precursor.Quantity Precursor.Normalised Label.Ratio RT RT.Start RT.Stop iRT Predicted.RT Predicted.iRT First.Protein.Description Lib.Q.Value Ms1.Profile.Corr Ms1.Area Evidence CScore Decoy.Evidence Decoy.CScore Fragment.Quant.Raw Fragment.Quant.Corrected Fragment.Correlations MS2.Scan
N:\MFA\MFA654-665_HEK_Ecoli_DIA_with_iRT\MFA654A.mzML P55011 P55011 S12A2_HUMAN SLC12A2 282504 286084 282504 286084 281322 281322 AAAAAAAAAAAAAAAGAGAGAK AAAAAAAAAAAAAAAGAGAGAK AAAAAAAAAAAAAAAGAGAGAK3 3 0.000256954 0.000455996 0.000415282 0.000416146 1 282504 286084 0 470.075 463.992 475.178 467.287 471.841 465.636 Solute carrier family 12 member 2 0.00293904 0.714551 469640 283.739 0.996095 196.418 0.432299 148771;177351;97033.6;79471.5;61490.9;20711.2; 93985.1;91485.5;97033.6;21597.1;26216.2;0; 0.398117;0.499402;0.962261;0.275779;0.0164255;0.3561; 47339
N:\MFA\MFA654-665_HEK_Ecoli_DIA_with_iRT\MFA654B.mzML P55011 P55011 S12A2_HUMAN SLC12A2 226368 191725 226368 191725 339003 339003 AAAAAAAAAAAAAAAGAGAGAK AAAAAAAAAAAAAAAGAGAGAK AAAAAAAAAAAAAAAGAGAGAK3 3 0.00310668 0.000392465 0.000358551 0.000359583 1 61822.8 89437.2 0 464.929 461.946 46.9 467.287 471.444 461.051 Solute carrier family 12 member 2 0.00293904 0.562793 156150 274.787 0.856456 205.955 0.29705 23905;54934;43340.2;27749.3;45590.4;18059.1; 7064.6;11418;43340.2;0;0;10600.4; 0.173492;0.27878;0.877492;-0.25966;0.365566;0.389279; 46676
N:\MFA\MFA654-665_HEK_Ecoli_DIA_with_iRT\MFA655A.mzML P55011 P55011 S12A2_HUMAN SLC12A2 129249 225992 129249 225992 295599 295599 AAAAAAAAAAAAAAAGAGAGAK AAAAAAAAAAAAAAAGAGAGAK AAAAAAAAAAAAAAAGAGAGAK3 3 0.000117041 0.0003885 0.000356379 0.00035727 1 129249 225992 0 467.287 464.264 470.914 467.287 471.423 463.335 Solute carrier family 12 member 2 0.00293904 0.895236 289665 480.766 0.996374 288.883 0.41737 66757.5;56977;45692.5;31070.5;34237.7;24600.9; 41328.3;42227.7;45692.5;17194;20302.5;11549.1; 0.579655;0.778587;0.936687;0.67858;0.539728;0.595337; 46523
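
For reference, the 10-column layout mentioned above is the long format MSstats expects, with one row per fragment ion per run. A minimal sketch in R, using the first row of the table above: the column names are the standard MSstats ones, while the `frag1` to `frag6` labels, the `Condition` and `BioReplicate` values, and the missing `ProductCharge` are placeholders assumed for illustration, since the DIA-NN output shown here does not carry that information.

```r
# The ten columns of the MSstats long input format.
msstats_columns <- c(
  "ProteinName", "PeptideSequence", "PrecursorCharge", "FragmentIon",
  "ProductCharge", "IsotopeLabelType", "Condition", "BioReplicate",
  "Run", "Intensity"
)

# Fragment.Quant.Corrected of the first posted row (MFA654A), split on ";".
frag <- as.numeric(strsplit("93985.1;91485.5;97033.6;21597.1;26216.2;0;",
                            ";", fixed = TRUE)[[1]])

long <- data.frame(
  ProteinName      = "P55011",
  PeptideSequence  = "AAAAAAAAAAAAAAAGAGAGAK",
  PrecursorCharge  = 3,
  FragmentIon      = paste0("frag", seq_along(frag)), # placeholder labels
  ProductCharge    = NA_integer_,                     # not reported in this output
  IsotopeLabelType = "L",
  Condition        = "unknown",                       # hypothetical, from an annotation
  BioReplicate     = 1,                               # hypothetical, from an annotation
  Run              = "MFA654A.mzML",
  Intensity        = frag,
  stringsAsFactors = FALSE
)

print(long)
```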



head_DIANN.txt

froehlic...@gmail.com

Sep 1, 2020, 10:35:10 AM
to MSstats
So I just set every quantitation to the transition y5 and proceeded with the analysis of a benchmark dataset.

The overall analysis worked well, but there were some really crazy outliers in the data (log2FC > 100).

I'd appreciate any input on the matter!

Best, Klemens

froehlic...@gmail.com

Sep 18, 2020, 4:39:06 AM
to MSstats
Okay, I finally found the time to look closer at the DIA-NN and MSstats analysis.
My mistake was to take the wrong quantitation column, which resulted in this weird outlier problem.
However, the main problem still exists: I do have the fragment quant, but I don't know which ions were quantified.

DIA-NN, however, seems to report the same number of ions for every precursor, even if the quant for one transition is empty.

So I split the quant info into extra columns and named these columns y5 to y10.
Even though the fragment ion annotation is artificial, the corresponding ions are always annotated the same way.

basically: 
DIA NN output
Exp     Pep    Frag.Quant
Run1  pep1  10,15
Run2  pep1  20,30

becomes
Exp     Pep    y5  y6
Run1  pep1  10  15
Run2  pep1  20  30  
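
A minimal sketch of this reshaping in base R, assuming the quant string is semicolon-separated with a trailing semicolon, as in the DIA-NN rows posted at the top of the thread. The toy data frame and the column name `Frag.Quant` follow the schematic above; the `y5`, `y6`, ... labels are artificial placeholders, not real ion annotations.

```r
# Toy input mirroring the schematic; real reports use e.g. Fragment.Quant.Corrected.
report <- data.frame(
  Exp        = c("Run1", "Run2"),
  Pep        = c("pep1", "pep1"),
  Frag.Quant = c("10;15;", "20;30;"),
  stringsAsFactors = FALSE
)

# Split each string on ";" (R drops the empty piece after the trailing ";").
pieces <- strsplit(report$Frag.Quant, ";", fixed = TRUE)
n_ions <- max(lengths(pieces))

# One row per precursor, one column per fragment position; short rows are padded with NA.
quant <- do.call(rbind, lapply(pieces, function(p) as.numeric(p)[seq_len(n_ions)]))

# Artificial labels starting at y5, applied identically to every row.
colnames(quant) <- paste0("y", seq(5, length.out = n_ions))

wide <- cbind(report[c("Exp", "Pep")], quant)
print(wide)
```

Because the labels depend only on the position within the string, the same fragment gets the same label in every run as long as DIA-NN reports the fragments in a consistent order.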


I will post how the analysis goes in case anybody else wants to use DIA-NN and MSstats.

Best, Klemens

Rayner Queiroz

Feb 18, 2021, 8:21:35 PM
to MSstats
Dear Klemens,

I am trying to use DIA-NN myself now. Would you mind sharing your script with me?

best

Rayner

Mateusz Staniak

Feb 19, 2021, 7:21:09 AM
to MSstats
Hi,


I plan to add a converter for DIA-NN output to MSstats. I don't have the column definitions yet; do you know where I can find them? Which column do you use for quantification?


Kind regards,
Mateusz

Miguel Cosenza

Feb 19, 2021, 8:46:16 AM
to MSstats
Hello,

I sent you both some answers directly via email, but I actually wanted to share them in this forum thread so other people can look at them if they find them interesting:

Hello Rayner,

The function/script at the link below is a modified version of Klemens's script.

https://github.com/MiguelCos/MSstats_labelfree_preprocessing/blob/master/R/diann2msstats.R

It uses Genes.MaxLFQ as the quantitative variable per feature.


Hello Mateusz,

I believe the documentation of DIA-NN has this information.

https://github.com/vdemichev/DiaNN/blob/master/DIA-NN%20GUI%20manual.pdf

In our lab, we are using the columns labeled as "Genes.MaxLFQ" which have the MaxLFQ quantitation info per feature.

Best wishes,
Miguel

froehlic...@gmail.com

Feb 19, 2021, 8:59:41 AM
to MSstats
Hi,

As Miguel just posted, the documentation can be found on GitHub.
All the PG.Quantity and MaxLFQ columns are on the protein level.
Precursor-level evidence seems to be:
Precursor.Quantity
Precursor.Normalised
Ms1.Area

Fragment-level quant seems to be:
Fragment.Quant.Raw
Fragment.Quant.Corrected

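The precursor quantity does seem to be a sum of fragment quantities. In the three rows posted at the top of this thread, it matches the sum of the first three Fragment.Quant.Corrected values rather than the sum of all Fragment.Quant.Raw values.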
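
This can be checked directly against the three rows posted at the top of the thread. A small sketch in base R (values copied from those rows; the helper name `split_quant` is just for illustration) compares Precursor.Quantity with the sum of all Fragment.Quant.Raw values and with the sum of the first three Fragment.Quant.Corrected values. For these rows only the latter agrees, to within rounding; three rows of a single precursor are not enough to say how DIA-NN computes the value in general.

```r
# Values copied from the three rows posted at the top of this thread.
rows <- data.frame(
  Precursor.Quantity = c(282504, 61822.8, 129249),
  Fragment.Quant.Raw = c(
    "148771;177351;97033.6;79471.5;61490.9;20711.2;",
    "23905;54934;43340.2;27749.3;45590.4;18059.1;",
    "66757.5;56977;45692.5;31070.5;34237.7;24600.9;"
  ),
  Fragment.Quant.Corrected = c(
    "93985.1;91485.5;97033.6;21597.1;26216.2;0;",
    "7064.6;11418;43340.2;0;0;10600.4;",
    "41328.3;42227.7;45692.5;17194;20302.5;11549.1;"
  ),
  stringsAsFactors = FALSE
)

# Turn one ";"-separated string into a numeric vector.
split_quant <- function(x) as.numeric(strsplit(x, ";", fixed = TRUE)[[1]])

# Candidate 1: sum of all raw fragment quantities.
sum_all_raw <- sapply(rows$Fragment.Quant.Raw,
                      function(x) sum(split_quant(x)), USE.NAMES = FALSE)

# Candidate 2: sum of the first three corrected fragment quantities.
sum_first3_corr <- sapply(rows$Fragment.Quant.Corrected,
                          function(x) sum(split_quant(x)[1:3]), USE.NAMES = FALSE)

print(data.frame(
  Precursor.Quantity = rows$Precursor.Quantity,
  sum_all_raw        = sum_all_raw,
  sum_first3_corr    = sum_first3_corr
))
```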

If you want to do protein summarization in MSstats and not directly use DIA-NN's protein output, then pragmatically Precursor.Quantity should be correct?
Vadim (the main developer of DIA-NN) answered me on GitHub within minutes the last time I had a question.

Best, Klemens

Fernando Tobias

Jul 6, 2021, 11:34:54 PM
to MSstats
Hello,

I'm looking for a way to use MSstats to analyse DIA-NN outputs.

I haven't been able to get the DIA-NN to MSstats preprocessing script to work. Is the input for the script the report.tsv from DIA-NN?

Thanks,
Fernando

Mateusz Staniak

Jul 7, 2021, 7:38:54 AM
to MSstats
Hi,

What's the problem with the script?
The DIA-NN converter is high on my priority list, but unfortunately I haven't had a chance to write it yet.

Best,
Mateusz

Fernando Tobias

Jul 7, 2021, 4:26:45 PM
to MSstats
Hello Mateusz,

I attempted to use the function that Miguel shared.

I'm not quite sure if the function can directly read the main output report that DIA-NN creates. You will have to excuse me, but my R skills are quite limited.

I tried running the function after defining diann_data and annotation_file.

> diann_data <- read.delim(file = "C:/DIA-NN/1.8/report_2.tsv")
> annotation_file <- read.csv(file = "C:/DIA-NN/1.8/lynch_diapasef_annotation.csv")

> DIANN_to_MSstats()
Error in mutate(diann_data, File.Name = str_replace(diann_data[[1]], ".*\\\\",  : 
  argument "diann_data" is missing, with no default
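
The error message indicates that `DIANN_to_MSstats()` was called without arguments: R does not pick up objects from the workspace just because they share a name with a function's parameters. A minimal sketch of the difference, using a hypothetical stand-in function (`convert_demo` and its `annotation_file` parameter are assumptions for illustration; only the `diann_data` argument name is confirmed by the error message):

```r
# Hypothetical stand-in for the converter; it only joins the two tables on Run.
convert_demo <- function(diann_data, annotation_file) {
  merge(diann_data, annotation_file, by = "Run")
}

# Objects in the workspace that happen to share the parameter names.
diann_data      <- data.frame(Run = c("A", "B"), Intensity = c(1, 2))
annotation_file <- data.frame(Run = c("A", "B"), Condition = c("ctrl", "treat"))

# Calling with no arguments fails, even though objects of the same name exist.
res_missing <- tryCatch(convert_demo(), error = function(e) conditionMessage(e))
print(res_missing)

# Passing the objects explicitly works.
res_ok <- convert_demo(diann_data = diann_data, annotation_file = annotation_file)
print(res_ok)
```

By analogy, a call such as `DIANN_to_MSstats(diann_data, annotation_file)` may be what the script expects, but the exact signature should be checked in the script itself.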

Happy to take any advice on converting the report at this time.

Fernando

Sam Siljee

Nov 16, 2023, 9:11:28 PM
to MSstats
Hi Miguel,

With your permission, I'd like to adapt it for use in my own analysis program.

Best regards,
Sam

Mateusz Staniak

Nov 17, 2023, 1:59:57 PM
to MSstats
Hi,


There is a DIA-NN converter in the MSstats release version now.


Kind regards,
Mateusz Staniak

Sam Siljee

Nov 17, 2023, 3:34:05 PM
to MSstats
Thank you Mateusz,
This is very helpful! Unfortunately so far I've only managed to return a dataframe with no rows, but I'll keep trying.

Sam

Mateusz Staniak

Nov 17, 2023, 3:39:52 PM
to MSstats
Hi,

Are you using the latest versions of MSstats and MSstatsConvert? At one point there was a bug in the converter that caused empty outputs in some cases, but it has been fixed. If your problem persists, could you kindly share a reproducible example?


Kind regards,
Mateusz Staniak

Sam Siljee

Nov 17, 2023, 5:58:59 PM
to MSstats
Quick response!
I was using MSstatsConvert_1.10.0 and MSstats_4.8.0; I've now updated to MSstatsConvert_1.12.0 and MSstats_4.10.0, which has solved the problem!

Thank you so much for the support, by the way. It's one of the many things that make MSstats so fantastic!
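
For anyone hitting the same empty output, a quick sketch for comparing installed versions against the ones reported above as working. The helper `is_at_least` and the `known_good` vector are names made up for this example; the version numbers are the ones from this post.

```r
# Versions reported in this thread as giving non-empty converter output.
known_good <- c(MSstatsConvert = "1.12.0", MSstats = "4.10.0")

# Compare version strings component by component, not as plain text.
is_at_least <- function(installed, minimum) {
  package_version(installed) >= package_version(minimum)
}

for (pkg in names(known_good)) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    installed <- as.character(utils::packageVersion(pkg))
    status <- if (is_at_least(installed, known_good[[pkg]])) {
      "at or above the version reported to work"
    } else {
      "older than the version reported to work"
    }
    message(pkg, " ", installed, ": ", status)
  } else {
    message(pkg, " is not installed")
  }
}
```

Using `package_version` rather than comparing strings matters here, because as plain text "4.10.0" would sort before "4.8.0".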

Sam