potential bug in .countMissingPercentage

Selim Bouaouina

Aug 8, 2025, 1:42:42 PM
to MSstats
Dear MSstats team,

I'm using MSstats (MSstats_4.14.2) to analyze differentially abundant proteins between multiple conditions.
While trying to understand how MissingPercentage in the MSstats::groupComparison output is calculated, I noticed that the values in that column do not match the values I get when I run each function (and sub-function) manually and in a pairwise manner.

I run groupComparison with a 3x3 comparison matrix for the groups h_comp, s_comp and w_comp.
The levels of processedData$ProteinLevelData$GROUP are then in alphabetical order (h_comp, s_comp, w_comp), and I fill in the comparison matrix in that order.
However, in the function .countMissingPercentage (within MSstats::groupComparison), a table named "counts" is defined:
    counts = summarized[, list(totalN = unique(TotalGroupMeasurements),
        NumMeasuredFeature = sum(NumMeasuredFeature, na.rm = TRUE),
        NumImputedFeature = sum(NumImputedFeature, na.rm = TRUE)),
        by = "GROUP"]

The rows of "counts" are then ordered w_comp, s_comp, h_comp, so the for-loop that follows picks the wrong rows when it executes "conditions = contrast_matrix[i, ] != 0".
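
A minimal sketch of the mismatch described above, assuming hypothetical toy data in which w_comp happens to appear first in "summarized". The numbers and the comparison name are made up for illustration; only the "counts" definition is taken from the post. It relies on data.table's "by" returning groups in order of first appearance rather than in factor-level order.

```r
library(data.table)

# Hypothetical toy data: rows appear in the order w_comp, s_comp, h_comp,
# while the factor levels are alphabetical (h_comp, s_comp, w_comp).
summarized = data.table(
    GROUP = factor(c("w_comp", "w_comp", "s_comp", "s_comp", "h_comp", "h_comp"),
                   levels = c("h_comp", "s_comp", "w_comp")),
    TotalGroupMeasurements = rep(20, 6),
    NumMeasuredFeature = c(10, 10, 8, 8, 2, 2),
    NumImputedFeature = c(0, 0, 1, 1, 4, 4)
)

# Same aggregation as quoted in the post.
counts = summarized[, list(totalN = unique(TotalGroupMeasurements),
                           NumMeasuredFeature = sum(NumMeasuredFeature, na.rm = TRUE),
                           NumImputedFeature = sum(NumImputedFeature, na.rm = TRUE)),
                    by = "GROUP"]

# Custom contrast with columns in factor-level order (assumed layout).
contrast_matrix = matrix(c(1, -1, 0), nrow = 1,
                         dimnames = list("h_comp vs s_comp",
                                         c("h_comp", "s_comp", "w_comp")))

# The logical mask is positional, so it is applied to rows 1 and 2 of counts.
conditions = contrast_matrix[1, ] != 0
selected = as.character(counts$GROUP[conditions])
print(selected)  # w_comp and s_comp, although h_comp and s_comp were intended
```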

A short-term workaround is to run comparisons only with 2x2 comparison matrices; that way, the mix-up of rows in the for-loop should not produce wrong MissingPercentage values.
I tried different comparisons and concluded that this might be a bug in MSstats. Please let me know if you cannot reproduce these wrong MissingPercentage values with data of your choice.

Cheers,
Selim

Anthony Wu

Aug 20, 2025, 9:58:54 AM
to MSstats
Hi,

Thank you for pointing out this bug.

I've reproduced the issue on my end and can confirm that it occurs when using custom contrast matrices (the exception being when contrast.matrix = "pairwise"). The problem is that rows in the "counts" object are ordered based on the sequence in which groups first appear in the "summarized" table, rather than matching the column order of the contrast matrix.

The fix is to add this line of code to ensure proper alignment:

    counts = counts[match(colnames(contrast_matrix), GROUP)]

This will reorder the counts table to match the column names of the contrast matrix, ensuring consistent ordering between the two objects.
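
As a rough illustration of the reordering, here is a small sketch with a hypothetical "counts" table and contrast matrix (the values and the comparison name are invented; only the match() line comes from the fix above). It assumes every column name of the contrast matrix occurs exactly once in GROUP.

```r
library(data.table)

# Hypothetical counts table whose rows are NOT in contrast-matrix column order.
counts = data.table(
    GROUP = c("w_comp", "s_comp", "h_comp"),
    totalN = c(20, 20, 20),
    NumMeasuredFeature = c(20, 16, 4),
    NumImputedFeature = c(0, 2, 8)
)

# Hypothetical custom contrast comparing h_comp against s_comp.
contrast_matrix = matrix(c(1, -1, 0), nrow = 1,
                         dimnames = list("h_comp vs s_comp",
                                         c("h_comp", "s_comp", "w_comp")))

# The fix: reorder rows so that row k of counts corresponds to column k
# of the contrast matrix.
counts = counts[match(colnames(contrast_matrix), GROUP)]

# The positional mask now selects the intended groups.
conditions = contrast_matrix[1, ] != 0
selected = counts$GROUP[conditions]
print(selected)  # h_comp and s_comp
```

If a contrast-matrix column had no matching group, match() would return NA for it, so a check such as stopifnot(!anyNA(match(colnames(contrast_matrix), counts$GROUP))) before reordering may be worth considering.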

I'll include this fix in the upcoming release 3.21.

Thanks again for catching this!

Thanks,
Tony

Selim Bouaouina

Aug 25, 2025, 4:21:36 AM
to MSstats
Hi Tony,

Thanks for getting back to this and suggesting a more pragmatic fix!

Best,
Selim

Anthony Wu

Aug 26, 2025, 2:23:35 PM
to MSstats
Hi,

Fixes should be available in the next release (3.22), but let me know if you need this pushed in the current release and I can work with the MSstats team to make that happen.

Thanks,
Tony

Selim Bouaouina

Aug 27, 2025, 3:51:25 AM
to MSstats
Hi Tony,

Thank you very much for the offer; it's much appreciated. Because of a deadline, I have already repeated my analyses split into 2x2 comparison matrices, so there is no rush for the fix.

Thank you,
Selim