!! Important updates of all MSstats family packages in Bioconductor 3.13

Meena Choi

May 24, 2021, 6:46:57 PM
to MSstats
Dear MSstats users,

Bioconductor 3.13 has been released. It includes major version upgrades: MSstats v4.0, MSstatsTMT v2.0, MSstatsPTM v1.2, MSstatsConvert v1.2, and MSstatsLOBD v1.0.

The major updates are:
- More exported functions to make the workflow more modular
- Performance updates (20-40% faster)
- Removed some dependencies
- New logging system
- !! The outputs of some functions have changed to make them consistent across the MSstats family packages.

Here is the summary of all the important changes. If you use any of the affected functions in your analysis or pipeline, please check and update your code.

To install the updates:
1. R 4.1 is required.
2. Make sure that you can install Bioconductor 3.13:
BiocManager::install(version="3.13")
3. Install the MSstats family packages:
BiocManager::install(c("MSstats", "MSstatsTMT", "MSstatsPTM"))
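
Before running the install commands above, a quick pre-flight check can confirm that the R version is new enough. This is only a sketch using base R; the BiocManager call is skipped if that package is not installed, and the wording of the messages is made up for illustration.

```r
# Pre-flight check before installing; base R only.
r_ok <- getRversion() >= "4.1.0"
message(
  "R ", getRversion(),
  if (r_ok) " is new enough." else " is too old; Bioconductor 3.13 needs R 4.1."
)

# BiocManager::version() reports the Bioconductor release currently in use.
if (requireNamespace("BiocManager", quietly = TRUE)) {
  message("Bioconductor version in use: ", BiocManager::version())
} else {
  message("BiocManager is not installed; run install.packages(\"BiocManager\") first.")
}
```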

Hope that you enjoy the updates!

- MSstats Team

Veronique Storme

May 25, 2021, 3:55:05 AM
to MSstats
Dear MSstats Team,

With the new version I now get an error from:

> raw <- SkylinetoMSstatsFormat(input)

"Error in vecseq(f__, len__, if (allow.cartesian || notjoin || !anyDuplicated(f__,  : 
  Join results in 98028 rows; more than 53319 = nrow(x)+nrow(i). Check for duplicate key values in i each of which join to the same group in x over and over again. If that's ok, try by=.EACHI to run j for each group to avoid the large allocation. If you are sure you wish to proceed, rerun with allow.cartesian=TRUE. Otherwise, please search for this error message in the FAQ, Wiki, Stack Overflow and data.table issue tracker for advice."

I checked for duplicate rows with:

> input.distinct <- input %>% distinct()

No duplicate rows were found. What could be the problem here?
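
Note that distinct() only drops rows that are identical in every column, whereas the data.table message is about repeated key values in a join. The difference can be sketched on made-up data with base R. The key columns below are an assumption chosen for illustration, not necessarily the columns the converter joins on.

```r
# Toy table (made-up values): two rows share the same run and feature
# but differ in Area, so they are not identical rows.
toy <- data.frame(
  FileName = c("run1.mzML", "run1.mzML", "run2.mzML"),
  PeptideModifiedSequence = c("PEPTIDER", "PEPTIDER", "PEPTIDER"),
  FragmentIon = c("y10", "y10", "y10"),
  Area = c(100.5, 200.25, 300),
  stringsAsFactors = FALSE
)

# Whole-row check, similar in spirit to dplyr::distinct(): finds nothing.
n_full_dups <- sum(duplicated(toy))

# Key-only check: these key columns are an assumption for illustration.
key_cols <- c("FileName", "PeptideModifiedSequence", "FragmentIon")
n_key_dups <- sum(duplicated(toy[, key_cols]))

print(c(full_row = n_full_dups, key_only = n_key_dups))
```

A table can therefore pass the whole-row check and still contain repeated keys.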

Thanks,
Veronique

Mateusz Staniak

May 25, 2021, 4:26:15 AM
to MSstats
Hi Veronique,

What is your annotation for this dataset?

Mateusz

Veronique Storme

May 25, 2021, 5:32:25 AM
to MSstats
Dear Mateusz,

This is what the data looks like:

head(input)
         ProteinName PeptideModifiedSequence PrecursorCharge FragmentIon ProductCharge IsotopeLabelType
1 AT2G37690.1.Cys290    DGSMVC[+57]YPVVETIHR               2         y10             1            light
2 AT2G37690.1.Cys290    DGSMVC[+57]YPVVETIHR               2         y10             1            light
3 AT2G37690.1.Cys290    DGSMVC[+57]YPVVETIHR               2         y10             1            light
4 AT2G37690.1.Cys290    DGSMVC[+57]YPVVETIHR               2         y10             1            light
5 AT2G37690.1.Cys290    DGSMVC[+57]YPVVETIHR               2         y10             1            light
6 AT2G37690.1.Cys290    DGSMVC[+57]YPVVETIHR               2         y10             1            light
  Condition BioReplicate                                              FileName     Area StandardType
1      ctrl            1 B08535_Ap_WNE3_trap2_CMB-753_KRGEV-Patrick_DIA-1.mzML 257786.6           NA
2       0.1            1 B08537_Ap_WNE3_trap2_CMB-753_KRGEV-Patrick_DIA-4.mzML 126372.8           NA
3       0.5            1 B08539_Ap_WNE3_trap2_CMB-753_KRGEV-Patrick_DIA-7.mzML 329678.6           NA
4      ctrl            2 B08541_Ap_WNE3_trap2_CMB-753_KRGEV-Patrick_DIA-2.mzML 105091.5           NA
5       0.1            2 B08543_Ap_WNE3_trap2_CMB-753_KRGEV-Patrick_DIA-5.mzML 228964.3           NA
6       0.5            2 B08545_Ap_WNE3_trap2_CMB-753_KRGEV-Patrick_DIA-8.mzML 299007.1           NA
  Truncated DetectionQValue
1     False    2.800386e-05
2     False    7.454621e-05
3     False    5.508858e-07
4     False    2.926904e-03
5     False    1.177298e-06
6     False    2.265132e-07

Mateusz Staniak

May 25, 2021, 5:48:47 AM
to MSstats
Hi,

Can you share all the data with me? I'll see what the problem is. If the data is not public, you can just send it to me via e-mail.

Mateusz

Veronique Storme

May 25, 2021, 6:17:26 AM
to MSstats
I'd prefer not to, as this is not my data.

Mateusz Staniak

May 25, 2021, 7:23:33 AM
to MSstats
Hi,

Thank you for your help. The error was the result of an unfortunate bug that affects SRM inputs. It will be fixed on Bioconductor today.

Kind regards
Mateusz

Miguel Cosenza

May 26, 2021, 9:28:43 AM
to MSstats
Hello MSstats team,

Thanks for the hard work and congrats on this new release!

I wanted to ask: is there a way to parallelize the `dataProcess` function in MSstats? The cluster option was very useful to increase the speed of the analyses but it is not there in the new version.

Best wishes,
Miguel

Mateusz Staniak

May 26, 2021, 9:46:36 AM
to MSstats
Dear Miguel,

Looking at the examples for MSstatsSummarizationOutput, you will find the full dataProcess workflow in the new interface. In this workflow, you can replace the call to MSstatsSummarize with a custom parallelized loop over calls to MSstatsSummarizeSingleTMP (or ...Linear). The example for the MSstatsSummarizeSingleTMP function will be helpful, too.
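
The general shape of such a loop can be sketched as follows. This is only an illustration under stated assumptions: the toy data, the PROTEIN column name, and the summarize_one_protein() helper are hypothetical stand-ins, and the real arguments of MSstatsSummarizeSingleTMP should be taken from its help page. The sketch uses mclapply() from the base parallel package.

```r
library(parallel)

# Toy stand-in for the data prepared for summarization (made-up values).
toy_input <- data.frame(
  PROTEIN = rep(c("P1", "P2", "P3"), each = 4),
  RUN = rep(1:2, times = 6),
  ABUNDANCE = c(20.1, 20.4, 19.8, 20.0,
                15.2, 15.9, 15.5, 15.1,
                22.3, 22.0, 21.7, 22.4)
)

# Hypothetical per-protein summarizer: a stand-in for a call such as
# MSstatsSummarizeSingleTMP(); here it just takes the median per run.
summarize_one_protein <- function(single_protein) {
  out <- aggregate(ABUNDANCE ~ RUN, data = single_protein, FUN = median)
  out$PROTEIN <- single_protein$PROTEIN[1]
  out
}

# One list element per protein, then process the elements in parallel.
# mclapply() forks, which Windows does not support, so fall back to one core.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
input_split <- split(toy_input, toy_input$PROTEIN)
summarized <- mclapply(input_split, summarize_one_protein, mc.cores = n_cores)

# Combine the per-protein results into a single table.
result <- do.call(rbind, summarized)
print(result)
```

Splitting by protein keeps each task independent, so the same pattern should work with other backends, such as a cluster created by makeCluster() and used through parLapply().
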
Please let us know if you have any more questions.
We will certainly expand on this topic in the online documentation.

Kind regards
Mateusz

Marc van Oostrum

May 28, 2021, 7:21:31 AM
to MSstats
Dear MSstats Team

Thanks for the update, looks great!
Do you have a more comprehensive list of changes than what's listed in the link above?
I experienced some issues upgrading from MSstats 3.18 that were related to:
  • Output of dataProcess: FeatureLevelData$GROUP_ORIGINAL doesn't exist anymore
  • Output of groupComparison: ComparisonResult$Label is of class character; it used to be a factor
  • If you apply an FCcutoff to the volcano plots, the coloring is off
Best regards
Marc
Mateusz Staniak

May 30, 2021, 7:27:44 PM
to MSstats
Hi,

- we will look at the coloring issue,
- the contents of GROUP_ORIGINAL column are now in the GROUP column,
- it turned out that the issue with the Skyline dataset was related to a different, previously unnoticed (and hard to find) bug, which I have fixed. The update will be pushed to Bioconductor on Monday,
- we will definitely prepare more materials describing the new version and differences vs previous version soon.
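
For scripts written against the older outputs, a small compatibility shim along these lines may help in the meantime. It is only a sketch: the two data frames are toy stand-ins with made-up values, not real dataProcess or groupComparison output.

```r
# Toy stand-ins (made-up values) for the new-style outputs.
feature_level <- data.frame(
  PROTEIN = c("P1", "P1"),
  GROUP = c("ctrl", "0.5"),
  stringsAsFactors = FALSE
)
comparison_result <- data.frame(
  Protein = c("P1", "P2"),
  Label = c("0.5 vs ctrl", "0.5 vs ctrl"),
  stringsAsFactors = FALSE
)

# Older scripts that read GROUP_ORIGINAL can be fed a copy of GROUP.
if (!"GROUP_ORIGINAL" %in% names(feature_level)) {
  feature_level$GROUP_ORIGINAL <- feature_level$GROUP
}

# Older scripts that rely on Label being a factor can convert it back.
if (!is.factor(comparison_result$Label)) {
  comparison_result$Label <- factor(comparison_result$Label)
}

str(feature_level)
str(comparison_result)
```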

Kind regards
Mateusz

Mateusz Staniak

Jun 3, 2021, 4:12:37 PM
to MSstats
Dear Veronique,

Bioconductor updates take some time, but version 4.0.1 is now available and it should fix your problem. Both MSstats and MSstatsConvert require an update.
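
To confirm that the update has arrived, the installed versions can be checked with base R. This is a sketch: only the MSstats version number (4.0.1) comes from this thread, so the MSstatsConvert version is printed without being compared to anything.

```r
# Report installed versions; the namespaces are loaded but not attached.
pkgs <- c("MSstats", "MSstatsConvert")
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

for (pkg in pkgs) {
  if (installed[[pkg]]) {
    message(pkg, " ", packageVersion(pkg))
  } else {
    message(pkg, " is not installed")
  }
}

# The thread names 4.0.1 as the fixed MSstats version.
if (installed[["MSstats"]]) {
  msstats_fixed <- packageVersion("MSstats") >= "4.0.1"
  message("MSstats includes the fix: ", msstats_fixed)
}
```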
Kind regards
Mateusz