DIA-NN 2.0 parquet file support

Marcel

Feb 24, 2025, 8:01:24 AM
to MSstats
Hi,

Does MSstats or MSstatsShiny support the parquet file from DIA-NN 2.0?
The "standard" DIA-NN report is no longer created by DIA-NN. Additionally, the columns "Fragment.Quant.Raw" and "Fragment.Correlations" are not included in the parquet file (maybe due to the change in the quantification strategy).

Will a new function be available in the package ‘MSstatsConvert’ to handle the new DIA-NN output? Which columns are needed for conversion?

Or maybe I'm missing some output parameter settings in DIA-NN.

Best regards,
Marcel

Witold Szymanski

Feb 24, 2025, 8:07:17 AM
to MSstats
Hi Marcel!

Afaik, you can still set DIA-NN to save the report as a .tsv, just by changing the file extension in the GUI:

C:\MS_DATA\temp\report.tsv

But the new report has a completely different structure, different column names, etc., so the MSstats people will need to adjust the reader for sure.

Greets
Witek

Anthony Wu

Feb 28, 2025, 3:07:19 PM
to MSstats
Hi,

To help us handle the different structure, could you attach a sample parquet report or tsv report in the new format?

Thanks,
Tony

Marcel

Mar 10, 2025, 11:55:56 AM
to MSstats
Hi Tony,

Thank you very much.

Here is a sample parquet report.

Best
Marcel
DIA-NN20_report.parquet

Anthony Wu

Mar 12, 2025, 4:23:25 PM
to MSstats
Hi Marcel,

By any chance, does the new version of DIA-NN output any files that include the fragment ion intensities? In the parquet file provided, there don't seem to be any columns containing fragment ion intensities.

Thanks,
Tony

Anatoly U.

May 5, 2025, 7:43:06 PM
to MSstats
You can output fragment data with the --export-quant flag in DIA-NN 2.0 (thanks to Vadim Demichev for the suggestion - https://github.com/vdemichev/DiaNN/discussions/1525), but it is not the usual long format organized by fragment; it is organized by precursor. Here is an example of such output...
I wonder if it would be possible to bypass the fragment-level analysis in MSstats and start at the precursor level, to be compatible with DIA-NN 2.0 and make the data processing less time-intensive.
BTW, the parquet file can be read into MSstats with the "arrow" package.
Thank you, MSstats team!
Anatoly
report.parquet
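
To picture the difference between a precursor-organized table and the usual fragment-long format mentioned above, here is a minimal Python sketch on made-up data. It is only an illustration: the fragment column names (`frag_1`, `frag_2`) and the helper `wide_to_long` are hypothetical, and this is neither the actual DIA-NN 2.0 column layout nor the workaround posted on GitHub.

```python
def wide_to_long(rows, id_cols=("Run", "Precursor.Id"), frag_prefix="frag_"):
    """Turn one-row-per-precursor records into one-row-per-fragment records.

    Column names starting with `frag_prefix` are treated as fragment
    intensities (a hypothetical naming scheme used only for this sketch).
    """
    long_rows = []
    for row in rows:
        ids = {col: row[col] for col in id_cols}
        for col, value in row.items():
            if not col.startswith(frag_prefix):
                continue  # not a fragment column
            if value is None:
                continue  # skip fragments with no recorded intensity
            long_rows.append({**ids, "Fragment": col, "Intensity": value})
    return long_rows


# Made-up example: one precursor observed in two runs.
precursor_rows = [
    {"Run": "run1", "Precursor.Id": "PEPTIDEK2", "frag_1": 1200.0, "frag_2": 800.0},
    {"Run": "run2", "Precursor.Id": "PEPTIDEK2", "frag_1": 1100.0, "frag_2": None},
]

fragment_rows = wide_to_long(precursor_rows)
for r in fragment_rows:
    print(r)
```

Each output record carries the run and precursor identifiers plus one fragment name and its intensity, which is the shape a fragment-level workflow expects.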

Anatoly U.

May 8, 2025, 10:37:56 AM
to MSstats
Please see my post on the DIA-NN GitHub page for a working workaround: https://github.com/vdemichev/DiaNN/discussions/1525
Anatoly

Anthony Wu

May 13, 2025, 1:35:44 PM
to MSstats
Hi,

If you want to make data processing less time-intensive, you can set the `featureSubset` parameter to "topN" in dataProcess, and then set `n_top_feature` to the number of fragment ions to use for summarization (we've generally set this value to 50).

Tony
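
As a rough picture of what a top-N feature subset does, the Python sketch below keeps, for each protein, the n features with the highest mean log2 intensity across runs. This is an assumption-laden illustration on made-up data, not MSstats's implementation: the function `top_n_features` and the record layout are hypothetical.

```python
import math
from collections import defaultdict


def top_n_features(records, n):
    """Return {protein: [feature, ...]} for the n highest mean log2 intensities.

    `records` is an iterable of (protein, feature, intensity) tuples;
    missing or non-positive intensities are ignored.
    """
    logs = defaultdict(list)
    for protein, feature, intensity in records:
        if intensity is not None and intensity > 0:
            logs[(protein, feature)].append(math.log2(intensity))

    # Score each feature by its mean log2 intensity across runs.
    per_protein = defaultdict(list)
    for (protein, feature), values in logs.items():
        per_protein[protein].append((sum(values) / len(values), feature))

    # Keep the n best-scoring features per protein, highest first.
    return {
        protein: [feature for _, feature in sorted(scored, reverse=True)[:n]]
        for protein, scored in per_protein.items()
    }


# Made-up data: three fragment ions of one protein, two runs each.
records = [
    ("P1", "y4", 1024.0), ("P1", "y4", 4096.0),
    ("P1", "y5", 256.0), ("P1", "y5", 256.0),
    ("P1", "b3", 16384.0), ("P1", "b3", 16384.0),
]
selected = top_n_features(records, n=2)
print(selected)
```

Fewer features per protein means less data to summarize, which is where the time saving comes from.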

eduard...@crg.cat

Jul 14, 2025, 12:44:39 PM
to MSstats
Hi everyone,

I am trying to analyze report data from DIA-NN 2.2, and I am not sure whether the DIA-NN import function in MSstats already supports the --export-quant file format introduced in DIA-NN 2.x and discussed in this thread. Are there any updates on this?

Eduard

On Tuesday, May 13, 2025 at 19:35:44 UTC+2, antho...@gmail.com wrote:

Anthony Wu

Jul 15, 2025, 4:34:10 PM
to MSstats
Hi,

Thank you, Eduard, for providing a sample dataset to work with, and Anatoly, for showing your workaround. I published some code that incorporates the workaround, and I'll look to push it to the current release of Bioconductor by the end of next week. It should handle the new format if you specify quantificationColumn = 'auto' in the DIANNtoMSstatsFormat function.

Tony

Anthony Wu

Jul 29, 2025, 11:03:59 AM
to MSstats
Changes to support the DIA-NN 2.0 format should now be in Bioconductor release 3.21.