MSStats+ temporal run and Spectronaut input recommendations


Janice Lee

May 29, 2026, 5:19:51 PM

to MSstats

Hi MSStats,

I'm here with more MSStats+ questions! First, thanks for putting up a sample dataset I can run through here (https://bioconductor.org/packages//release/bioc/vignettes/MSstats/inst/doc/MSstatsPlus.html). It was helpful to see the two options to "convert" Spectronaut output to MSstats-ready format and to include anomaly model features. I am going to try this with a dataset I have now. This dataset has ~96 samples.

First of all, I erred on the side of caution and downloaded a Spectronaut report with the ~60 columns in the example dataset "spectronaut_raw". This file is ~32GB. I understand that there are 3 key anomaly features that should be there ("FG.ShapeQualityScore (MS1)", "FG.ShapeQualityScore (MS2)", "EG.DeltaRT"). Would you therefore recommend that I download a Spectronaut report with only the columns required for MSStats, plus those 3 additional columns? Additionally, though perhaps tangentially, is this something MSStatsBig could help with in any way?

Secondly, the R. Run Date(Formatted) column in my dataset has the same time stamp for every sample (due to the way our data is transferred from instruments to archives and finally to our designated search computers). Could I create a "run_order" data frame with 2 columns, one with file names and one with an order that I enter manually, and use that in the "alternative"/"non-auto" run converter route?

Thanks in advance and sorry if these questions were already raised before!

Janice

Anthony Wu

Jun 4, 2026, 7:10:50 PM

to MSstats

Hi Janice,

Here are my answers to your questions:

> would you recommend that I download a Spectronaut report with all required columns for MSStats and add the 3 additional columns?

The Spectronaut report in the example vignette (i.e. spectronaut_quality_input.csv) contains more columns than are actually needed. If you're able to select columns in Spectronaut, I would recommend selecting the following columns (along with the 3 additional columns):

PG.ProteinGroups or PG.ProteinAccessions (this column is used to set ProteinName)

EG.ModifiedSequence

FG.Charge

F.FrgIon

F.Charge

R.FileName

R.Condition and R.Replicate (if you already annotated your experimental design on spectronaut)

R.Fraction (if you have fractions)

F.PeakArea

F.FrgLossType

EG.Qvalue

PG.Qvalue

F.PossibleInterference

F.ExcludedFromQuantification
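If the large report has already been exported, the column selection above can also be applied after the fact. The following is only a minimal sketch in Python (standard library only), not part of MSstats or MSstatsBig: it streams a delimited report row by row and keeps the columns named in this thread. The function name `select_columns`, the `KEEP` list, and the assumption that the report is comma-delimited are all illustrative; adjust the list (for example, add R.Fraction or swap in PG.ProteinAccessions) and the delimiter to match your export.

```python
import csv
import io

# Columns named in this thread, plus the 3 anomaly features.
# This list is an assumption about your export; edit it as needed.
KEEP = [
    "PG.ProteinGroups", "EG.ModifiedSequence", "FG.Charge", "F.FrgIon",
    "F.Charge", "R.FileName", "R.Condition", "R.Replicate", "F.PeakArea",
    "F.FrgLossType", "EG.Qvalue", "PG.Qvalue", "F.PossibleInterference",
    "F.ExcludedFromQuantification",
    "FG.ShapeQualityScore (MS1)", "FG.ShapeQualityScore (MS2)", "EG.DeltaRT",
]


def select_columns(src, dst, keep=KEEP, delimiter=","):
    """Stream rows from src to dst, keeping only the columns in `keep`.

    Only one row is held in memory at a time, so the input can be larger
    than RAM. Returns the requested columns that the header lacks.
    """
    reader = csv.DictReader(src, delimiter=delimiter)
    header = reader.fieldnames or []
    present = [c for c in keep if c in header]
    missing = [c for c in keep if c not in header]
    writer = csv.DictWriter(
        dst,
        fieldnames=present,
        delimiter=delimiter,
        extrasaction="ignore",  # silently drop every column not in `present`
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return missing


# Tiny in-memory demonstration with made-up values.
demo_in = io.StringIO(
    "R.FileName,F.PeakArea,EG.DeltaRT,Unused\n"
    "run1.raw,100,0.1,x\n"
)
demo_out = io.StringIO()
demo_missing = select_columns(demo_in, demo_out)
```

For a real file, the same function could be called on handles opened with `open(path, newline="")` for reading and `open(out_path, "w", newline="")` for writing. Checking the returned list of missing columns before running the converter is a cheap way to catch a misspelled or unexported column.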

> Additionally, though perhaps tangentially, is this something MSStatsBig could help with in any way?

Yes, if Spectronaut forces you to keep more columns than necessary, MSstatsBig can help by removing the unneeded columns prior to processing, reducing the size of your dataset. To go on a tangent of my own, the main benefit of MSstatsBig is that it reduces the size of your dataset by selecting the top N fragment ions for processing (rather than all fragment ions). This is particularly useful if your DIA dataset can't fit in RAM, or if you feel that low-abundance fragment ions are noisy and prone to large technical error.

> Could I create a "run_order" data frame with 2 columns, one with file names and one with an order that I enter manually, and use that in the "alternative"/"non-auto" run converter route?

Yes, you can create a data frame with 2 columns, Run (i.e. your file names), and Order (i.e. the run order that you manually enter). Then you can set `runOrder` as that data frame.
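As a rough illustration of that two-column table, here is a minimal Python sketch (standard library only) that numbers a list of file names in the order given and writes them out with the column names Run and Order. The helper names and the sample file names are hypothetical, and whether Run values need to match R.FileName exactly (with or without a file extension) is an assumption to check against your own report. The resulting CSV could then be read into R (for example with `read.csv`) and supplied as `runOrder`.

```python
import csv
import io


def build_run_order(file_names):
    """Return Run/Order rows, numbering the files in the order given."""
    if len(set(file_names)) != len(file_names):
        raise ValueError("each run may appear only once")
    return [
        {"Run": name, "Order": position}
        for position, name in enumerate(file_names, start=1)
    ]


def write_run_order(rows, dst):
    """Write the rows as a two-column CSV with a Run,Order header."""
    writer = csv.DictWriter(dst, fieldnames=["Run", "Order"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical file names, listed in their true acquisition order.
acquired = ["sample_A.raw", "sample_B.raw", "sample_C.raw"]
run_order_rows = build_run_order(acquired)

run_order_csv = io.StringIO()
write_run_order(run_order_rows, run_order_csv)
```

Rejecting duplicate names up front is a deliberate choice here: with ~96 manually entered runs, a repeated file name is the most likely typing mistake.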

Thanks,

Tony
