Access to FORMAT values of a VCF record


Tios

Oct 19, 2021, 2:05:04 AM
to Pysam User group
Hi,

I cannot figure out how to retrieve FORMAT values from a pysam.VariantRecord object.
The VariantRecord.format attribute only gives access to the FORMAT metadata, not the actual per-sample values.
I would like to be able to retrieve and edit the FORMAT values of a variant record.

Thank you.

Marcel Martin

Oct 19, 2021, 3:46:06 AM
to pysam-us...@googlegroups.com
If you wanted to edit "DP", for example, it would look somewhat like this:

from pysam import VariantFile

vf = VariantFile("file.vcf")
record = next(iter(vf))  # first record in the file
record.samples["my_sample_name"]["DP"] = 29  # set FORMAT/DP for one sample

- record.samples is a VariantRecordSamples object
- record.samples["my_sample_name"] is a VariantRecordSample object
- Instead of indexing samples by sample name, I think you can also index
with an int.

If you want to write a FORMAT field that is not declared in the header,
you’ll get an error, so you need to make sure that you add the
appropriate ##FORMAT header line beforehand.

You may want to have a look at how we do this in WhatsHap. Phasing
information (PS or HP tags) is written by a "PhasedVcfWriter" class:
<https://github.com/whatshap/whatshap/blob/main/whatshap/vcf.py#L861>

Regards,
Marcel

Tios

Oct 21, 2021, 12:35:19 AM
to Pysam User group
Thank you so much!

On Tuesday, October 19, 2021 at 4:46:06 PM UTC+9, marcel...@scilifelab.se wrote: