Hi Mateusz,
No, I did not run a converter. We used PeakView for peptide quantitation, and I cannot find a converter function from PeakView output to MSstats, unless I have missed it?
Therefore I wrote a script myself to convert the wide format to the long format.
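A wide-to-long conversion of this kind can be sketched in base R as follows. This is only a minimal illustration, not the actual script: the table `wide`, its run columns `run_A1` and `run_B1`, and all values are hypothetical, and a real PeakView export will have different column names.

```r
# Hypothetical wide table: one row per fragment ion, one intensity column per run.
wide <- data.frame(
  ProteinName     = c("P1", "P1"),
  PeptideSequence = c("PEPTIDEK", "PEPTIDEK"),
  FragmentIon     = c("y7", "y8"),
  run_A1          = c(1000, 2000),
  run_B1          = c(1500, NA),
  stringsAsFactors = FALSE
)

# Columns that identify a feature; everything else is assumed to be a run.
id_cols  <- c("ProteinName", "PeptideSequence", "FragmentIon")
run_cols <- setdiff(names(wide), id_cols)

# Stack one block of rows per run column, keeping the identifier columns.
long <- do.call(rbind, lapply(run_cols, function(r) {
  out <- wide[, id_cols, drop = FALSE]
  out$Run <- r
  out$Intensity <- wide[[r]]
  out
}))
rownames(long) <- NULL

print(long)
```

Each feature should end up with exactly one row per run after a conversion like this.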
Following your suggestion, I removed the duplicate rows and ran dataProcess again, protein by protein, but this particular protein (A0A8V1ABS2) still runs into the same error. Below is my code:
library(MSstats)  # load MSstats
ms <- read.delim('Norman_jejunum_reformat_20240214.txt', sep = '\t')
head(ms)  # show the first rows and the column names
ms <- ms[!duplicated(ms), ]  # drop rows that are identical in every column
test_list <- unique(ms$ProteinName)
# run dataProcess one protein at a time to find the one that fails
for (p in test_list) {
  ms1 <- ms[ms$ProteinName == p, ]
  print(p)
  QuantData <- MSstats::dataProcess(ms1)
}
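Note that `duplicated(ms)` only catches rows that are identical in every column. Rows that share the same feature and run but differ in intensity would survive it. The sketch below is one possible check for such rows. It assumes the standard MSstats long-format column names, and the toy data frame `ms_toy` and its values are made up for illustration.

```r
# Toy data in the MSstats long format (all values are made up).
ms_toy <- data.frame(
  ProteinName      = "P1",
  PeptideSequence  = "PEPTIDEK",
  PrecursorCharge  = 3,
  FragmentIon      = c("y7", "y7", "y8"),
  ProductCharge    = 1,
  IsotopeLabelType = "L",
  Condition        = "A",
  BioReplicate     = 1,
  Run              = 1,
  Intensity        = c(1000, 1200, 500),
  stringsAsFactors = FALSE
)

# Columns that together should identify one measurement (feature x run).
key_cols <- c("ProteinName", "PeptideSequence", "PrecursorCharge",
              "FragmentIon", "ProductCharge", "IsotopeLabelType", "Run")

# Rows identical in every column: these are what duplicated(ms) removes.
full_dups <- sum(duplicated(ms_toy))

# Rows sharing a key with another row, regardless of Intensity.
keys <- ms_toy[, key_cols]
dup_keys <- duplicated(keys) | duplicated(keys, fromLast = TRUE)
conflicts <- ms_toy[dup_keys, ]

print(full_dups)
print(conflicts)
```

On the real data, the same check would be applied to `ms` (or to `ms1` for the failing protein) instead of `ms_toy`. If `conflicts` is not empty, there is more than one intensity for the same feature and run, which may be related to the "Aggregate function missing, defaulting to 'length'" message in the output.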
The first few proteins ran fine and then the error popped up. Here are the last few lines of the output:
[1] "A0A8V1ABS2"
INFO [2024-02-14 21:55:44] ** Features with one or two measurements across runs are removed.
INFO [2024-02-14 21:55:44] ** Fractionation handled.
INFO [2024-02-14 21:55:44] ** Updated quantification data to make balanced design. Missing values are marked by NA
INFO [2024-02-14 21:55:44] ** Log2 intensities under cutoff = 7.8211 were considered as censored missing values.
INFO [2024-02-14 21:55:44] ** Log2 intensities = NA were considered as censored missing values.
INFO [2024-02-14 21:55:44] ** Use all features that the dataset originally has.
INFO [2024-02-14 21:55:44]
# proteins: 1
# peptides per protein: 59-59
# features per peptide: 4-6
INFO [2024-02-14 21:55:44]
A B
# runs 10 10
# bioreplicates 10 10
INFO [2024-02-14 21:55:44] Some features are completely missing in at least one condition:
ANVPN[Dea]KVIQC[PPa]FAETGQVQK_3_y10_1,
ANVPN[Dea]KVIQC[PPa]FAETGQVQK_3_y12_1,
ANVPN[Dea]KVIQC[PPa]FAETGQVQK_3_y16_2,
ANVPN[Dea]KVIQC[PPa]FAETGQVQK_3_y7_1,
ANVPN[Dea]KVIQC[PPa]FAETGQVQK_3_y8_1 ...
INFO [2024-02-14 21:55:44] The following runs have more than 75% missing values: 11
INFO [2024-02-14 21:55:44] == Start the summarization per subplot...
| | 0%Aggregate function missing, defaulting to 'length'
<simpleError in .Primitive("length")(newABUNDANCE, keep = TRUE): 2 arguments passed to 'length' which requires 1>
INFO [2024-02-14 21:55:54] == Summarization is done.
Error: Elements listed in `by` must be valid column names in x and y
In addition: Warning message:
Input data.table 'y' has no columns.
Actually, we have MSstats 2.4 installed on another computer, and when we ran this same file with that version it worked fine.