SNPs


Iván V

Apr 17, 2021, 12:45:39 AM
to nemo-simul


Hi all,

I am trying to run an analysis using SNPs available on DRYAD. Although I was able to run a simple test (resize_patch_capacity at generation 2), I am afraid something is going wrong. I would appreciate some help with these two questions:

 

1) The set of SNPs I am working with has a lot of triallelic SNPs (00, 11, 12, 13, 21, 22, 23, 31, 32, 33). Is it OK to run Nemo with triallelic SNPs, or should I remove those alleles and use biallelic SNPs only?

 

2) The only way I found to run my analysis was to set the number of alleles to 33, since every time I tried to run it with 3 alleles I received a message from Nemo saying "allele value 33 is greater than 3". The same happened when using biallelic SNPs only ("allele value 22 is greater than 2"). What should I do to fix this?

 


Thanks!

Frederic Guillaume

Apr 19, 2021, 9:15:28 AM
to nemo-simul
Hi Ivan

Yes, it is possible to run simulations with 3 alleles per locus, but for the 'ntrl' trait only.

As for your second question, I don't know when this error message is issued; I couldn't find it in the code (is something missing from your report?). My guess is that you are reading the SNP data from a file formatted as an FSTAT file and have mis-specified the number of digits used to encode the allele values in the genotypes. In your case this should be set to 1; it is the last number on the first line of an FSTAT file. Nemo can then correctly parse the compact genotypes into two separate allele values (33 -> 3 & 3).
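To illustrate why the digit setting matters, here is a minimal Python sketch of how a compact genotype token can be split into two alleles. It is not Nemo's actual parser; the function name `split_genotype` and the integer-division approach are assumptions made for illustration only.

```python
from typing import Tuple


def split_genotype(token: str, digits: int) -> Tuple[int, int]:
    """Split a compact FSTAT-style genotype into two allele values.

    Hypothetical helper, not taken from Nemo: the token is read as one
    integer and cut into two alleles of `digits` decimal digits each.
    """
    value = int(token)
    base = 10 ** digits
    return value // base, value % base


# With 1 digit per allele, "33" is read as alleles 3 and 3.
print(split_genotype("33", 1))

# With 2 digits per allele, the same token is read as alleles 0 and 33.
print(split_genotype("33", 2))

# A genuine 2-digit encoding of the same genotype would be "0303".
print(split_genotype("0303", 2))
```

Under these assumptions, declaring 2 digits for a file that actually uses 1 digit per allele yields an allele value of 33, which may be how a value larger than the declared number of alleles arises.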

I hope I've correctly guessed the source of the error. If not, please send an example of your parameter and input files.

best
Fred

Iván V

Apr 19, 2021, 2:53:39 PM
to nemo-simul

Hi Fred,

 

Thanks for the help. I made the changes you suggested and it worked, but the .dat files in the fst folder look a bit strange. Although I have three alleles (01, 02, and 03), there are also a lot of 256 alleles. I suspect this is related to the missing allele values in the original FSTAT file I am working with, but I am not sure. I tried running the analyses with three different FSTAT files (using missing values 0, 00, and 0000), but in all cases I got the same result.

Here is an example from a .dat file I got from a simulation test:

1 7972 3 2

loc1

loc2

Loc7968

age

sex

ped

origin

1 0303 03256 01256 256256 25602 01256 256256 256256 256256 0202 02256 …..  2 1 1 1

 

And here is an example from the FSTAT I am using as source population:

2             7972      3             1

L0001

L0002

L7968

age

sex

ped

origin

1 00 00 11 33 22 00 00 00 11….     4 1 0 0

 

I was wondering if this 256 allele is right or if I should remove the missing values from the FSTAT file to fix this (or if there is anything else I should do)?

Frederic Guillaume

Apr 20, 2021, 12:59:57 PM
to nemo-simul
Hi Ivan

You are using an older version of Nemo (2.3.46?), hence the allele value of 256, which comes from the missing allele value. The latest versions (>2.3.50) no longer accept missing values in FSTAT files, so you should set the missing value to something other than 0. Allele values are set internally within the range [0, ntrl_all[ by subtracting 1 (0-based values), which is why an input value of 0 was previously converted to 256.
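The arithmetic behind the 256 can be sketched in a few lines of Python. This assumes that alleles are stored as unsigned 8-bit integers, which is inferred from the value 256 itself and is not stated in this thread; the function names are hypothetical.

```python
def stored_allele(input_value: int) -> int:
    """Convert a 1-based input allele to a 0-based stored value.

    Assumption: storage is an unsigned 8-bit integer, so the
    subtraction wraps around modulo 256.
    """
    return (input_value - 1) % 256


def written_allele(stored: int) -> int:
    """Convert the 0-based stored value back to a 1-based output value."""
    return stored + 1


# Input alleles 1, 2, 3 round-trip unchanged; a missing value of 0
# wraps to 255 in storage and is written back out as 256.
for value in (0, 1, 2, 3):
    stored = stored_allele(value)
    print(value, stored, written_allele(stored))
```

Under that assumption, only the input value 0 changes on the round trip, which matches the 01, 02, 03, and 256 alleles seen in the .dat file.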

That said, there is indeed a mistake in the allele digit value that Nemo writes to the FSTAT file for data with max allele < 10 (e.g. SNP data). The digit value is wrongly set to 2, as you noticed. This will be changed in the next update of Nemo.
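As a rough illustration of what the digit value in the FSTAT header should reflect, the sketch below derives it from the largest allele value and formats a genotype accordingly. Both helpers are hypothetical and are not Nemo's writer code.

```python
def allele_digits(max_allele: int) -> int:
    """Number of decimal digits needed to write the largest allele value."""
    return len(str(max_allele))


def format_genotype(allele1: int, allele2: int, digits: int) -> str:
    """Write two alleles as one compact token, zero-padded to `digits` each."""
    return f"{allele1:0{digits}d}{allele2:0{digits}d}"


# SNP data with alleles 1 to 3 needs only 1 digit per allele.
digits = allele_digits(3)
print(digits)
print(format_genotype(3, 3, digits))

# With a digit value of 2, the same genotype is padded to four characters.
print(format_genotype(3, 3, 2))
```

With 1 digit the homozygote for allele 3 is written as 33, and with 2 digits as 0303, which is the form seen in the .dat output above.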
Thanks for pointing this out!