I'm trying to write the full-length sequences (including invariant sites) from a tree sequence to FASTA format. We're hoping to use some common phylogenetic methods, and most of them rely on invariant sites being present. The invariant sites can be any symbol (I can just change them in the text file later), but they need to be there. I saw some discussion about this here (https://github.com/tskit-dev/tskit/issues/338), but I was unsure whether a method was ever implemented for it. To print just the variant haplotypes, I've been using:
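# `ts` is a tskit tree sequence loaded earlier.
samples = ts.samples()
# One string per sample, containing only the variable sites.
haps = list(ts.haplotypes())
# Label each sequence with its sample node ID and that node's population.
sequence_IDs = [f'sample_{u}_pop_{ts.node(u).population}' for u in samples]
with open('ts_mig01.fas', 'w') as f:
    for seq_id, hap in zip(sequence_IDs, haps):
        f.write(f'>{seq_id}\n{hap}\n')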
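To make the desired output concrete, here is a rough sketch of the padding I have in mind, written in plain Python so it does not depend on any tskit feature I'm unsure about. It assumes integer site positions, at most one site per position, and single-character alleles. The helper names `pad_haplotype` and `to_fasta` and the `filler` parameter are hypothetical, not part of tskit.

```python
def pad_haplotype(hap, positions, sequence_length, filler="A"):
    """Return a full-length sequence with `filler` at every invariant site.

    hap: string of alleles, one character per variable site.
    positions: integer coordinate of each variable site, in the same order.
    """
    if len(hap) != len(positions):
        raise ValueError("need exactly one allele per site position")
    seq = [filler] * sequence_length
    for allele, pos in zip(hap, positions):
        seq[pos] = allele
    return "".join(seq)


def to_fasta(names, haps, positions, sequence_length, filler="A"):
    """Build FASTA text with one full-length record per haplotype."""
    records = []
    for name, hap in zip(names, haps):
        full = pad_haplotype(hap, positions, sequence_length, filler)
        records.append(f">{name}\n{full}\n")
    return "".join(records)


# Toy input: two variable sites at positions 1 and 4 of a 6-base sequence.
# With a real tree sequence, the inputs could presumably be taken as
#   positions = [int(site.position) for site in ts.sites()]
#   sequence_length = int(ts.sequence_length)
# (assumption: the sites sit at integer coordinates).
fasta_text = to_fasta(["sample_0_pop_0", "sample_1_pop_1"],
                      ["01", "10"], [1, 4], 6)
print(fasta_text, end="")
```

Every position without a site gets the filler symbol, which can be swapped for whatever the downstream phylogenetic tool expects.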