newABUNDANCE = NA, despite available ABUNDANCE value


Christian Schori

Dec 15, 2022, 11:12:42 AM
to MSstats
Dear MSstats-Team

I've recently observed NAs in the newABUNDANCE column (dataProcess > FeatureLevelData$newABUNDANCE) that I don't understand. These NAs occur independently of the "censored" column and independently of the values in INTENSITY/ABUNDANCE. Can you please explain why these newABUNDANCE entries are NA and how this affects the intensity roll-up to ProteinLevelData?

I've uploaded the dataProcess output of a Spectronaut (v. 16) report here. I've also observed these NAs in data from Spectronaut 17, DIA-NN 1.8.1, and FragPipe DIA/DDA (v. 18 & 19).

Thank you for looking into this.

Best,
Christian

SessionInfo:
R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.1 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=de_CH.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=de_CH.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=de_CH.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=de_CH.UTF-8 LC_IDENTIFICATION=C      

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base    

other attached packages:
 [1] gridExtra_2.3   vroom_1.5.7     forcats_0.5.2   stringr_1.5.0   dplyr_1.0.10    purrr_0.3.4     readr_2.1.2     tidyr_1.2.0     tibble_3.1.8    ggplot2_3.3.6   tidyverse_1.3.2 MSstats_4.4.1


Mateusz Staniak

Dec 20, 2022, 7:22:21 AM
to MSstats
Hi,

Can I see the dataProcess input and parameters? This is most likely related to the procedure we apply after normalization (https://github.com/Vitek-Lab/MSstats/blob/master/R/utils_censored.R), which treats very small values (depending on quantiles of the ABUNDANCE distribution) as censored missing values, but I want to double-check. The transformations that have already happened in dataProcess make it unclear to me. Thank you for reporting the possible issue and providing the data.


Kind regards
Mateusz

Christian Schori

Dec 20, 2022, 11:07:51 AM
to MSstats
Hi Mateusz

Thank you for looking into this issue. I've actually only used the default settings for the whole process. (I also see the NAs in newABUNDANCE if I process only the top100 feature subset.)

library(tidyverse)
library(MSstats)

SN_report <- read.csv("~/PUMA/Christian/Spectronaut17_benchmark/SN16/20221206_092357_SN16_ec_spikein_Report.csv")
annotation <- read.csv("~/PUMA/Christian/Spectronaut17_benchmark/SN16/annotation.csv")

SNtoMSstats <- SpectronauttoMSstatsFormat(SN_report, annotation)

SN_dataprocess <- dataProcess(SNtoMSstats)
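
To narrow down which rows are affected, a check along the following lines may help. This is only a minimal sketch on a toy data frame with invented values, standing in for SN_dataprocess$FeatureLevelData. It assumes nothing beyond the column names mentioned in this thread (ABUNDANCE, newABUNDANCE, censored) and that censored is a logical column; the names feature_data, unexpected_na, na_by_censored and affected_range are hypothetical.

```r
# Toy stand-in for SN_dataprocess$FeatureLevelData (values invented for illustration).
feature_data <- data.frame(
  ABUNDANCE    = c(12.1, 3.2, NA, 10.4, 2.9),
  newABUNDANCE = c(12.1, NA, NA, 10.4, NA),
  censored     = c(FALSE, FALSE, TRUE, FALSE, TRUE)
)

# Rows where ABUNDANCE is available but newABUNDANCE is NA.
unexpected_na <- !is.na(feature_data$ABUNDANCE) & is.na(feature_data$newABUNDANCE)

# Cross-tabulate the affected rows against the censored flag.
na_by_censored <- table(unexpected_na = unexpected_na,
                        censored = feature_data$censored)
print(na_by_censored)

# ABUNDANCE range of the affected rows, to compare with the overall range.
affected_range <- range(feature_data$ABUNDANCE[unexpected_na])
overall_range  <- range(feature_data$ABUNDANCE, na.rm = TRUE)
print(affected_range)
print(overall_range)
```

On real output, swapping the toy data frame for SN_dataprocess$FeatureLevelData should show whether the affected rows sit at the low end of the ABUNDANCE distribution, which is what one would expect if a quantile-based cutoff is responsible.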

I've just uploaded the original files (the Spectronaut report and the annotation file; sorry, they're quite big) in case you'd like to reproduce the whole process. You can find the files here.

Best,
Christian