mass.calculate_mass() with N-term modification


Hassan Hijazi

Jun 4, 2024, 6:29:05 AM
to Pyteomics
Hello,

Why is the calculated mass one H short when the peptide has an N-terminal modification?
Reproducible example:

```
import pyteomics.mass as mass

variable_mods = {'pr-': True,  # N-term prop
                 'me2': ['K'],
                 'bu': ['K'],
                 'pr': ['K']
                 }

modifications = {
    'pr-': 'Propionyl',
    'me2': 'Dimethyl',
    'bu': 'Butyryl',
    'pr': 'Propionyl'
}

aa_comp = dict(mass.std_aa_comp)

db = mass.Unimod()

# Use the Unimod composition of each modification as its aa_comp entry
for mod, mod_name in modifications.items():
    aa_comp[mod] = db.by_title(mod_name)['composition']

mass.calculate_mass(sequence='pr-buKSAPATGGVme2KprKPHR', aa_comp=aa_comp, charge=0, ion_type='M', max_mods=None)
```

Output:

>>> mass.calculate_mass(sequence='pr-buKSAPATGGVme2KprKPHR', aa_comp=aa_comp, charge=0, ion_type='M', max_mods=None)
1641.94404950555

Whereas the mass should be 1642.951859.

Where am I mistaken?

Lev Levitsky

Jun 4, 2024, 9:24:16 AM
to pyte...@googlegroups.com, Hassan Hijazi
Hi Hassan,

One question: how did you get the expected result of 1642.95?
I get this value if I calculate the mass of the unmodified peptide and then add the mass of every modification:

In [11]: mass.calculate_mass(sequence='KSAPATGGVKKPHR') + db.by_title('Propionyl')['mono_mass'] * 2 + db.by_title('Dimethyl')['mono_mass'] + db.by_title('Butyryl')['mono_mass']
Out[11]: 1642.9518751016801

However, this should not equal the mass of a modified peptide carrying the same set of modifications, because the result above still includes the N-terminal "H-" of the unmodified peptide. That sum is therefore higher by one hydrogen than the calculated mass of the modified peptide, and the calculated mass seems correct to me.
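As a quick arithmetic check of the "one hydrogen" claim, the sketch below uses only the two masses quoted in this thread and the monoisotopic mass of hydrogen. The variable names are illustrative, and the small residual is assumed to come from rounding in the Unimod mono_mass values.

```python
# Masses copied from this thread (Da)
mass_with_terminal_group = 1641.94404950555   # calculate_mass() on 'pr-buKSAPATGGVme2KprKPHR'
mass_summed_deltas = 1642.9518751016801       # unmodified peptide + Unimod mono_mass deltas

# Monoisotopic mass of hydrogen (1H), in Da
H_MONO = 1.00782503207

difference = mass_summed_deltas - mass_with_terminal_group
residual = difference - H_MONO

print(f"difference = {difference:.6f} Da")
print(f"hydrogen   = {H_MONO:.6f} Da")
print(f"residual   = {residual:.2e} Da")
```

The two results differ by one hydrogen mass to well within 1e-5 Da.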

Please feel free to correct me or add more information if needed.

Best regards,
Lev



Lev Levitsky

Jun 4, 2024, 11:18:21 AM
to Hassan Hijazi, pyte...@googlegroups.com
Hi Hassan,

Thanks for the clarification. I think you're right: I intuitively treated terminal and regular mods differently in these calculations, supplying the full composition for terminal groups and a hydrogen-subtracted one (as in Unimod) for regular modifications. This is what seems to produce the correct results. In other words, the syntax we use implies alternative terminal groups, not terminal modifications.

I can see how this can easily lead to errors like the one you showed. We should think about what we can do on the Pyteomics side to avoid confusion. For example, if aa_comp keys without hyphens are used as terminal groups, we could subtract the H or OH automatically. Otherwise, I think we should just improve the documentation to explain the difference between a terminal group and a modification in this context.

The workaround for you now would just be to add the missing hydrogen to aa_comp['pr-'] relative to aa_comp['pr'].
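A minimal sketch of that adjustment, written with plain Python counters rather than Pyteomics objects so that it runs on its own. The Propionyl composition H(4) C(3) O is the Unimod delta; MONO, mono_mass, pr and pr_nterm are illustrative names, not part of the Pyteomics API.

```python
from collections import Counter

# Monoisotopic element masses (Da)
MONO = {'H': 1.00782503207, 'C': 12.0, 'O': 15.99491461956}

def mono_mass(comp):
    """Monoisotopic mass of an element-count mapping."""
    return sum(MONO[element] * count for element, count in comp.items())

# Unimod Propionyl delta, as used for the side-chain modification 'pr'
pr = Counter({'H': 4, 'C': 3, 'O': 1})

# Terminal group 'pr-': same composition plus the hydrogen the delta leaves out
pr_nterm = pr + Counter({'H': 1})

# Effect on the peptide mass reported at the start of the thread
corrected = 1641.94404950555 + MONO['H']

print(dict(pr_nterm))
print(f"pr  = {mono_mass(pr):.6f} Da")
print(f"pr- = {mono_mass(pr_nterm):.6f} Da")
print(f"corrected peptide mass = {corrected:.6f} Da")
```

With the extra hydrogen, the peptide mass agrees with the 1642.951859 value from Mascot to within about 2e-5 Da. In the original script the same change would be made to aa_comp['pr-'] before calling calculate_mass.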

Sorry for the confusion and I hope this helps.
Best regards,

Lev

On Tue, Jun 4, 2024 at 4:24 PM Hassan Hijazi <hejazi...@gmail.com> wrote:
Hi Lev,

OK here's where the discrepancy is occurring. 
The Unimod modifications (at least the ones I use, lysine acylations) are reported with one hydrogen subtracted.

For example, for acetyl:
H(2) C(2) O is the Unimod composition, whereas its structure (image.png) has H(3).
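The acetyl case can be written out as a small sketch: the acetyl group itself is C2H3O, and removing the hydrogen it replaces on the modified site gives the Unimod delta H(2) C(2) O. The names below are illustrative only.

```python
from collections import Counter

# Monoisotopic element masses (Da)
MONO = {'H': 1.00782503207, 'C': 12.0, 'O': 15.99491461956}

# Acetyl group as drawn in its structure: CH3-CO-
acetyl_group = Counter({'C': 2, 'H': 3, 'O': 1})

# The group replaces one hydrogen on the modified site, so the net change is the group minus H
acetyl_delta = acetyl_group - Counter({'H': 1})

delta_mass = sum(MONO[element] * count for element, count in acetyl_delta.items())

print(dict(acetyl_delta))
print(f"acetyl delta mass = {delta_mass:.6f} Da")
```

The resulting mass matches the Unimod monoisotopic value for Acetyl, 42.010565 Da.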

The way you calculated it above here should be the way to go for these modifications. 

To answer your first question, I got this mass from Mascot. 

Please correct me if I'm wrong.

Thank you.