Problem converting STACKS .fa to migrate

Diego Peralta

Aug 12, 2021, 3:23:52 PM

to migrate-support

Hi everybody,

I'm trying to convert populations.samples.fa from STACKS to migrate using fasta2genotype.py and stacks2mig.py, but both give me error messages. My STACKS run yields nearly 70k SNPs, but I decided to reduce the number after reading several posts in this group. I can upload the first lines of my input files if needed.

For the first script, I used 5500 SNPs (file populations2.samples.fa) and 32 individuals (1 SNP per locus and no missing data among individuals), but after I chose the options in the script, it stopped with the following error:

/dmperalta/migrate_files/1.1snp# python2.7 fasta2genotype.py populations2.samples.fa migrate.whiteloci migrate.populations populations.snps.vcf infile1

Output type? [1] Migrate [2] Arlequin [3] DIYABC [4] LFMM [5] Phylip [6] G-Phocs [7] Treemix [8] Haplotype: 1

Loci to use? [1] Variable [2] All: 2

Coverage Cutoff (number reads for locus)? Use '0' to ignore coverage: 0

Filter for allele frequency? False alleles might bias data. [1] Yes [2] No: 2

Remove monomorphic loci? [1] Yes [2] No: 2

Filter for missing genotypes? These might bias data. [1] Yes [2] No: 2

Clip cut sites? These may bias data. [1] Yes [2] No: 2

Cataloging populations...

Cataloging loci...

Counting gene copies...

Counting locus lengths...

Traceback (most recent call last):
  File "fasta2genotype.py", line 620, in <module>
    num_sites = SeqSitesCount(sys.argv[1], cutsite1, cutsite2, checkbothends)
  File "fasta2genotype.py", line 594, in SeqSitesCount
    nextline = newfasta.next(); nextline = nextline.strip()
StopIteration

With the second script, using the same FASTA file but with the full titles in the headers (file populations.samples.fa), I get the following error:

/dmperalta/migrate_stacks# cat populations.samples.fa | python2.7 stacks2mig.py

Traceback (most recent call last):
  File "stacks2mig.py", line 170, in <module>
    headers,sequences = read_stacks(infilename)
  File "stacks2mig.py", line 63, in read_stacks
    read_lines(sys.stdin,headers,sequences)
  File "stacks2mig.py", line 45, in read_lines
    index = h.index('>CLocus')
ValueError: '>CLocus' is not in list

Can someone help me with this? I also tried the new vcf2migrate.py script, but it gave me error messages as well.

Diego Peralta

Sep 22, 2021, 12:48:58 PM

to migrate-support

Hi everyone again,

After a month of gaining experience with the program and trying different types of data files, I was able to convert my STACKS (2.59) FASTA files to migrate using the converter fasta2genotype.py, written by Paul Maier.

I have read several questions from people trying to convert FASTA files to migrate, so I thought I would share my experience. I should say that I am new to bioinformatics, so someone else could probably do this more quickly and better.

The converter needs a FASTA file and a VCF file from the same run, plus a population file. The population file lists, for each sample, the number assigned to it by gstacks or denovo_map.pl, the sample name, and the population name, in the same order as the popmap used in populations. I used populations.samples.fa, produced by the --fasta-samples flag of populations, and populations.snps.vcf from the same run.

First I converted the FASTA file to the STACKS 1.12 format. The new FASTA format includes the sample name and chromosome in the header of each locus. I used these commands to remove them:

sed -i 's/_0 .*/_0/' populations.samples.fa

sed -i 's/_1 .*/_1/' populations.samples.fa
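
For anyone who prefers not to edit the file in place, here is a rough Python sketch of the same header clean-up that writes to a new file instead. It assumes the headers look like ">CLocus_1_Sample_5_Locus_1_Allele_0 [Q06B; chr1, 100, +]"; that shape is my assumption and is not shown in this thread, so check it against your own file first. The function names and the output file name are made up for illustration.

```python
import re

# Assumed header shape (check your own file):
# ">CLocus_1_Sample_5_Locus_1_Allele_0 [Q06B; chr1, 100, +]"
# Keep everything up to the allele suffix (_0 or _1) and drop the rest.
_HEADER_TAIL = re.compile(r"^(>\S*_[01]) .*$")


def strip_header(line):
    """Remove the annotation after '_0' or '_1' on a FASTA header line."""
    if line.startswith(">"):
        return _HEADER_TAIL.sub(r"\1", line)
    # Sequence lines (and anything else) pass through unchanged.
    return line


def strip_fasta_headers(src_path, dst_path):
    """Copy src_path to dst_path with the header annotations removed."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(strip_header(line))
```

It could be called as strip_fasta_headers("populations.samples.fa", "populations.samples.clean.fa"), which leaves the original file untouched in case the population file still has to be built from the full headers.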

After that, I looked in the .fa file for the numbers assigned to my samples and used them to write the population file. The format needs to look like this:

SampleID IndID PopID

5 Q06B A_Uru

1 PQ03 A_Uru

3 Q03 A_Uru

9 RN4 B_PatN

10 RN7 B_PatN

8 PV22 B_PatN
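
Looking up the sample numbers by hand gets tedious with many individuals, so here is a minimal Python sketch that could build the table above automatically. It has to run on the original populations.samples.fa, before the sample names are stripped from the headers. It assumes headers of the form ">CLocus_1_Sample_5_Locus_1_Allele_0 [Q06B; chr1, 100, +]" and tab-separated output with a header row; both are my assumptions, and the function names are hypothetical.

```python
import re

# Assumed header shape (not confirmed in this thread):
# ">CLocus_1_Sample_5_Locus_1_Allele_0 [Q06B; chr1, 100, +]"
# Group 1 is the sample number, group 2 the sample name.
_HEADER = re.compile(
    r"^>CLocus_\d+_Sample_(\d+)_Locus_\d+_Allele_\d+ \[([^;\]]+)"
)


def sample_ids_from_fasta(lines):
    """Map each sample name to the number found in the FASTA headers."""
    ids = {}
    for line in lines:
        match = _HEADER.match(line)
        if match:
            ids[match.group(2).strip()] = int(match.group(1))
    return ids


def population_table(popmap_rows, sample_ids):
    """Build 'SampleID IndID PopID' rows, keeping the popmap order."""
    rows = ["SampleID\tIndID\tPopID"]
    for name, pop in popmap_rows:
        # A KeyError here means a popmap sample is missing from the FASTA.
        rows.append(f"{sample_ids[name]}\t{name}\t{pop}")
    return "\n".join(rows) + "\n"
```

To use it, open the FASTA file and pass the file object to sample_ids_from_fasta, read the popmap into a list of (sample name, population) pairs, and write the string returned by population_table to the population file.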

After these easy steps, the files are ready to be converted to the migrate format. I hope this is useful for someone :).

Best to all,

Diego

Quinn Carvey

Feb 22, 2023, 12:21:52 PM

to migrate-support

Hi Diego,

Just wanted to say thank you for posting your solution, it's useful years later!