hgvs always stores amino acids as 1-letter codes internally, and the parser always accepts both 1-letter and 3-letter input. Output formatting is controlled by the p_3_letter config variable (see https://hgvs.readthedocs.io/en/stable/modules/config.html for more).
>>> import hgvs.parser
>>> hp = hgvs.parser.Parser()
>>> v1 = hp.parse_hgvs_variant("NP_009225.1:p.(Met297Ile)")
>>> str(v1)
'NP_009225.1:p.(Met297Ile)'
# str just calls the format method with global defaults. You can pass explicit config:
>>> v1.format(conf={"p_3_letter": True})
'NP_009225.1:p.(Met297Ile)'
>>> v1.format(conf={"p_3_letter": False})
'NP_009225.1:p.(M297I)'
# Specifying that everywhere gets old. Change the p_3_letter config globally so that str() works
>>> hgvs.global_config.formatting.p_3_letter = False
>>> str(v1)
'NP_009225.1:p.(M297I)'
# Works on more complex variants too
>>> v2 = hp.parse_hgvs_variant("NP_000050.2:p.(Tyr949MetfsTer11)")
>>> str(v2)
'NP_000050.2:p.(Y949Mfs*11)'
# As a bonus, here's how you drop the parens:
>>> v2.posedit.uncertain = False
>>> str(v2)
'NP_000050.2:p.Y949Mfs*11'
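If all you need is the bare change (M297I rather than the full string), the steps above can be wrapped up roughly as follows. This is only a sketch: one_letter_protein_change is a hypothetical helper name, not part of hgvs, and it assumes that a parsed variant can be deep-copied and that the formatted string contains ":p." exactly as in the examples above.

```python
import copy


def one_letter_protein_change(variant):
    """Return the bare 1-letter change (e.g. 'M297I') for a parsed p. variant.

    Hypothetical helper, not part of hgvs. `variant` is assumed to be the
    object returned by hgvs.parser.Parser().parse_hgvs_variant().
    """
    # Work on a copy so the caller's variant keeps its uncertainty flag.
    # (Assumption: parsed variants can be deep-copied.)
    v = copy.deepcopy(variant)
    # Drop the parentheses, as shown in the session above.
    v.posedit.uncertain = False
    # Force 1-letter output for this call only; global config is untouched.
    text = v.format(conf={"p_3_letter": False})
    # Keep only what follows the "p." prefix: 'NP_...:p.M297I' becomes 'M297I'.
    return text.split(":p.", 1)[1]
```

You would call it as one_letter_protein_change(hp.parse_hgvs_variant("NP_009225.1:p.(Met297Ile)")). Passing conf explicitly keeps the helper independent of whatever hgvs.global_config happens to be set to elsewhere in the pipeline.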
Hi Reece, how are you?
I'm CCing Marc on the BRCA Exchange team. As part of streamlining our back-end pipeline, we're looking for a method to derive one-letter protein HGVS variant representations, such as deriving M297I from NP_009225.1:p.(Met297Ile). I've been digging through the hgvs library to find functionality for this. For the simple case, I can see how it can be assembled from the posedit fields:
>>> vp = hp.parse_hgvs_variant("NP_009225.1:p.(Met297Ile)")
>>> vp.posedit.pos.start.aa
u'M'
>>> vp.posedit.pos.start.base
297
>>> vp.posedit.edit.alt
u'I'
but of course, hgvs nomenclature gets really complicated really quickly. For example, starting with NP_000050.2:p.(Tyr949MetfsTer11), how does one get to Y949Mfs*11? I've looked through the hgvs library to see if this formatting option is available, but haven't found it yet. Is that functionality available somewhere under hgvs?
If not, do you know of anything that we might use? I looked at mutalyzer, and it doesn't look nearly as complete as hgvs in its nomenclature support.
Thanks!
Melissa