The problems of importing data


Meng-Huan Song

Aug 6, 2021, 2:23:30 AM
to dadi-user
Hi Prof. Gutenkunst,  

I have whole-genome resequencing data for two populations, and I mapped them to a Hi-C version of the reference genome to obtain the VCF file.
First, I tried to generate the SFS file with:
dd = dadi.Misc.make_data_dict_vcf("vcf.file", "population.txt")
but it failed with the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/smh/miniconda3/lib/python3.9/site-packages/dadi/Misc.py", line 489, in make_data_dict_vcf
    popinfo_dict = _get_popinfo(popinfo_file)
  File "/home/smh/miniconda3/lib/python3.9/site-packages/dadi/Misc.py", line 692, in _get_popinfo
    sample = cols[sample_col]
IndexError: list index out of range
I think this may be because the first scaffold of my reference genome is larger than 512 MB; easySFS does not work either. What can I do to solve this problem?

Second, I found an R script at https://github.com/shenglin-liu/vcf2sfs that can produce SFS-format data. I tried to use it to convert the example file that dadi provides (example/fs_from_data), but it did not work, so I can't verify whether the frequency spectra it generates are the same as or similar to dadi's. Could you tell me whether the SFS generated by this script is reliable?

Sincerely,

Meng-Huan Song

Ryan Gutenkunst

Aug 6, 2021, 3:46:07 PM
to dadi-user
Hello Meng-Huan,

This may be a bug in our VCF parser, or a malformation of your VCF file. Can you share the header and the first few data lines of the VCF?

Best,
Ryan

Meng-Huan Song

Aug 8, 2021, 1:55:04 AM
to dadi-user
Hi Ryan,

Thank you for your quick response. The header of the VCF file is attached as figures.

I have hidden only the sample information. I hope this helps you solve the problem I asked about.

All the best,

1.jpg 

2.jpg

Meng-Huan Song


Ryan Gutenkunst

Aug 8, 2021, 4:11:01 PM
to dadi-user
Hello Meng-Huan,

As I said, screenshots are useless for me to help you. If you want help, send a small example file that reproduces the problem.

Best,
Ryan


Meng-Huan Song

Aug 8, 2021, 9:39:41 PM
to dadi-user
Hello Ryan,

Sorry for my misunderstanding.

I have attached a VCF file containing only a few lines (I tested it, and it fails with the same error as the complete VCF file), together with the population information text file.

Please do not hesitate to tell me if you need any other information.

Thanks for your help.

All the best,


Meng-Huan Song

test.vcf.gz
population.txt

Ryan Gutenkunst

Aug 10, 2021, 7:05:36 PM
to dadi...@googlegroups.com
Hello Meng-Huan,

This turns out to be a simple case. Dadi was confused by the blank line at the end of your popinfo file. Remove that blank line, and you’ll be fine. I also changed the dadi source distribution so that blank lines will no longer lead to an error.

Best,
Ryan
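
For readers with the same IndexError, the fix described above (removing blank lines from the popinfo file) could be scripted roughly as follows. This is only a sketch: the helper name strip_blank_lines and the output file name population.clean.txt are made up for illustration and are not part of dadi. It assumes the popinfo file is plain text with one sample per line.

```python
from pathlib import Path


def strip_blank_lines(popinfo_path, cleaned_path):
    """Copy a popinfo file, dropping empty or whitespace-only lines.

    Hypothetical helper (not part of dadi). Returns the number of
    lines that were removed.
    """
    lines = Path(popinfo_path).read_text().splitlines()
    # Keep only lines that contain something other than whitespace.
    kept = [line for line in lines if line.strip()]
    # Terminate every kept line with a newline, so no blank line is left at the end.
    Path(cleaned_path).write_text("".join(line + "\n" for line in kept))
    return len(lines) - len(kept)


# Example usage with the file names from this thread (dadi call left as a comment):
# removed = strip_blank_lines("population.txt", "population.clean.txt")
# dd = dadi.Misc.make_data_dict_vcf("test.vcf.gz", "population.clean.txt")
```

Writing to a new file rather than editing in place keeps the original popinfo file available for comparison. With a dadi version that includes the change mentioned above, this step should not be needed.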

Meng-Huan Song

Aug 11, 2021, 10:33:56 AM
to dadi-user
Hi Ryan,

Thank you very much for your help!

All the best,

Meng-Huan Song
