Problem converting STACKS .fa to migrate


Diego Peralta

Aug 12, 2021, 3:23:52 PM
to migrate-support
Hi everybody,

I'm trying to convert populations.samples.fa from STACKS to migrate using fasta2genotype.py and stacks2mig.py, but both give me error messages. My STACKS run yields nearly 70k SNPs, but I decided to reduce that number after reading several posts in this group. I can upload the first lines of my input files if needed.

For the first script, I used 5500 SNPs (file populations2.samples.fa) and 32 individuals (1 SNP per locus and no missing data among individuals), but after I chose the options in the script, it stopped with the following error:

/dmperalta/migrate_files/1.1snp# python2.7 fasta2genotype.py populations2.samples.fa migrate.whiteloci migrate.populations populations.snps.vcf infile1
Output type? [1] Migrate [2] Arlequin [3] DIYABC [4] LFMM [5] Phylip [6] G-Phocs [7] Treemix [8] Haplotype: 1
Loci to use? [1] Variable [2] All: 2
Coverage Cutoff (number reads for locus)? Use '0' to ignore coverage: 0
Filter for allele frequency? False alleles might bias data. [1] Yes [2] No: 2
Remove monomorphic loci? [1] Yes [2] No: 2
Filter for missing genotypes? These might bias data. [1] Yes [2] No: 2
Clip cut sites? These may bias data. [1] Yes [2] No: 2
Cataloging populations...
Cataloging loci...
Counting gene copies...
Counting locus lengths...
Traceback (most recent call last):
  File "fasta2genotype.py", line 620, in <module>
    num_sites = SeqSitesCount(sys.argv[1], cutsite1, cutsite2, checkbothends)
  File "fasta2genotype.py", line 594, in SeqSitesCount
    nextline = newfasta.next(); nextline = nextline.strip()
StopIteration

With the second script, using the same FASTA file but with the full titles in the headers (file populations.samples.fa), I get the following error:

/dmperalta/migrate_stacks# cat populations.samples.fa | python2.7 stacks2mig.py
Traceback (most recent call last):
  File "stacks2mig.py", line 170, in <module>
    headers,sequences = read_stacks(infilename)
  File "stacks2mig.py", line 63, in read_stacks
    read_lines(sys.stdin,headers,sequences)
  File "stacks2mig.py", line 45, in read_lines
    index = h.index('>CLocus')
ValueError: '>CLocus' is not in list

Can someone help me with this? I also tried the new vcf2migrate.py script, but it gave me error messages as well.

Diego Peralta

Sep 22, 2021, 12:48:58 PM
to migrate-support
Hi everyone again, 

After a month of gaining experience with the program and trying different types of data files, I was able to convert my STACKS (2.59) FASTA files to migrate using the converter fasta2genotype.py, written by Paul Maier.
I have read several questions from people trying to convert FASTA files to migrate, so I thought I would share my experience. I should say that I am new to bioinformatics, so someone else could probably do this faster and better.

The converter needs a FASTA file and a VCF file from the same run, plus a population file. The population file contains the number assigned to each sample by the gstacks or denovo_map.pl modules, the name of the sample, and the name of the population, ordered as in the popmap used in populations. I used the file populations.samples.fa produced by the --fasta-samples flag of populations and the populations.snps.vcf from the same run.

First I converted the FASTA file to the STACKS 1.12 format. In the new FASTA format, the header of each locus also contains the sample name and the chromosome. I used these commands to remove them:

sed -i 's/_0 .*/_0/' populations.samples.fa
sed -i 's/_1 .*/_1/' populations.samples.fa
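
For anyone who prefers not to edit the file in place, the same header clean-up can be sketched in Python. This is only a minimal sketch and not part of fasta2genotype.py: the function names and the output file name are hypothetical, and the example header in the comment is an assumption about what a STACKS 2 header looks like. It keeps everything before the first whitespace on each header line, which for headers like the assumed one gives the same result as the two sed commands.

```python
# Assumed STACKS 2 header (illustrative only):
#   >CLocus_1_Sample_5_Locus_1_Allele_0 [PQ03; chr1, 100, +]
# Desired STACKS 1.x-style header:
#   >CLocus_1_Sample_5_Locus_1_Allele_0

def trim_header(line):
    """Keep only the part of a FASTA header before the first whitespace."""
    if line.startswith(">"):
        return line.split(None, 1)[0]
    return line  # sequence lines (and anything else) pass through unchanged


def trim_fasta(in_path, out_path):
    """Write a copy of in_path with trimmed headers; the input is not modified."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(trim_header(line.rstrip("\n")) + "\n")


# Hypothetical usage (output name is made up):
# trim_fasta("populations.samples.fa", "populations.samples.trimmed.fa")
```

Writing to a new file instead of using `sed -i` keeps the original FASTA, with the sample names still in the headers, available for later steps.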

After that, I looked through the .fa file to find the number assigned to each of my samples and used it to write the population file. The format needs to look like this:

SampleID        IndID   PopID
5       Q06B    A_Uru
1       PQ03    A_Uru
3       Q03     A_Uru
9       RN4     B_PatN
10      RN7     B_PatN
8       PV22    B_PatN
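
Looking up the sample numbers by hand can also be sketched as a short Python script. This is only a sketch under several assumptions: it has to read the original, untrimmed FASTA; it assumes headers of the form `>CLocus_1_Sample_5_Locus_1_Allele_0 [Q06B; ...]`, with the sample name first inside the brackets; it assumes the popmap has the sample name and the population in its first two whitespace-separated columns; and it writes tab-separated columns. All function names are hypothetical, and the rows come out in popmap order.

```python
import re

# Assumed header layout (illustrative only):
#   >CLocus_<n>_Sample_<id>_Locus_<n>_Allele_<n> [<sample name>; ...]
HEADER_RE = re.compile(r"^>CLocus_\d+_Sample_(\d+)_\S*\s+\[([^;\]]+)")


def sample_ids(fasta_lines):
    """Map each sample name found in the headers to its numeric sample ID."""
    ids = {}
    for line in fasta_lines:
        match = HEADER_RE.match(line)
        if match:
            ids[match.group(2).strip()] = match.group(1)
    return ids


def read_popmap(popmap_lines):
    """Return (sample name, population) pairs in popmap order."""
    pairs = []
    for line in popmap_lines:
        fields = line.split()
        if len(fields) >= 2 and not line.startswith("#"):
            pairs.append((fields[0], fields[1]))
    return pairs


def population_rows(fasta_lines, popmap_lines):
    """Build the lines of the population file, header row first."""
    ids = sample_ids(fasta_lines)
    rows = ["SampleID\tIndID\tPopID"]
    for name, pop in read_popmap(popmap_lines):
        if name in ids:  # popmap samples absent from the FASTA are skipped
            rows.append("%s\t%s\t%s" % (ids[name], name, pop))
    return rows


# Hypothetical usage (the popmap file name is made up):
# with open("populations.samples.fa") as fa, open("popmap.txt") as pm:
#     print("\n".join(population_rows(fa, pm)))
```

It is worth comparing the output against a few headers by eye before passing the file to the converter.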

After these easy steps, the files are ready to be converted to the migrate format. I hope this is useful for someone :).

Best to all,
Diego 

Quinn Carvey

Feb 22, 2023, 12:21:52 PM
to migrate-support
Hi Diego,

Just wanted to say thank you for posting your solution, it's useful years later!

Cheers,
Quinn
