OmegaPlus input format


王丽

Sep 22, 2016, 12:21:22 PM
to OmegaPlus
On this website, I found this:

Please use the latest versions of SweeD (>= 3.2.3) and OmegaPlus (>=2.2.4). These versions facilitate the import of the analysis results into R.

But on this website (http://sco.h-its.org/exelixis/web/software/omegaplus/index.html), the newest version seems to be 2.2.2. Could someone point me to a download for OmegaPlus >= 2.2.4?

Download the most recent GNU GPL Linux version 2.2.2 here , it includes a bug fix in the VCF file parser that was associated with handling missing data.

It seems there is a VCF parser in 2.2.2, but the manual does not indicate that it can take a VCF file as input. Can OmegaPlus take a VCF file as input, just as SweeD does?

Thanks!

Li


Nikos Alachiotis

Sep 23, 2016, 5:26:19 AM
to OmegaPlus
Hi Li,

You can get the latest OmegaPlus version (currently 3.0.0) here:
https://github.com/alachins/omegaplus

Regards,
Nikos

--
You received this message because you are subscribed to the Google Groups "OmegaPlus" group.
To unsubscribe from this group and stop receiving emails from it, send an email to omegaplus+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Nikolaos Alachiotis

王丽

Sep 26, 2016, 11:18:07 AM
to omeg...@googlegroups.com
Thanks, Nikos! For the FASTA input format, should I concatenate only the SNPs into the FASTA file, or should I also include the invariable sites between SNPs?

I have very few samples in each population (3-6 individuals) but dense SNPs. If the SNPs are not phased, could that reduce the power to detect selective sweeps? If I choose to phase the data, I will lose some SNPs that cannot be phased against a big reference panel. Which do you suggest: unphased data with denser SNPs, or phased data with fewer SNPs?


Best
Li

--
You received this message because you are subscribed to a topic in the Google Groups "OmegaPlus" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/omegaplus/K0JAwU22exg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to omegaplus+unsubscribe@googlegroups.com.

王丽

Sep 26, 2016, 12:07:14 PM
to omeg...@googlegroups.com
" For the input fasta format, I am wondering about concatenating SNPs into fasta format, or should I include invariable sites between SNPs in the fasta? "

Thinking more about the question: concatenating SNPs into FASTA loses the physical positions of those SNPs. That will cause problems, as -minwin and -maxwin are defined as physical lengths in base pairs.

Could you explain more about what "grid" represents? How does it differ from -minwin and -maxwin (those are quite clearly illustrated in the manual)?

In the forum, I found that people also use VCF as input to OmegaPlus, but I did not find this in the manual. Will the VCF format be supported in OmegaPlus?

With best regards
Li



Pavlos Pavlidis

Sep 28, 2016, 5:04:13 AM
to omeg...@googlegroups.com, 王丽
Dear Li,
please correct me if I'm wrong:

You have several reads that can be assembled into contigs. You can concatenate them into one big file (this can be VCF), but you may lose the position information. If this is the case, I'd suggest avoiding concatenation; it is better to assemble them against a reference genome. If we assume that the SNPs you will lose are a random subset of all the SNPs, then I guess everything will be fine.


grid: The grid represents the resolution you want. For example, if you choose -grid 1000, the OmegaPlus score is evaluated at 1000 positions. A good strategy is to pick a grid size that gives one evaluation point every 5 kb. For example, if a chromosome is 100 Mb (i.e. 100,000,000 bp), use -grid 20000. More critical is -minwin: do not choose something very small. For most of the datasets we have tested, -minwin 5000 is fine.

-maxwin is not that critical. Use something like 100000 or 200000.
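
The grid arithmetic above can be sketched in a few lines of Python. This is only an illustration: the helper name `grid_size` is hypothetical, and the 5 kb spacing is the rule of thumb from this post, not a hard requirement.

```python
def grid_size(chrom_length_bp: int, spacing_bp: int = 5000) -> int:
    """Return a -grid value giving roughly one evaluation point per spacing_bp.

    Hypothetical helper for illustration; it is not part of OmegaPlus.
    """
    if chrom_length_bp <= 0 or spacing_bp <= 0:
        raise ValueError("lengths must be positive")
    # Integer division; always evaluate at least one position.
    return max(1, chrom_length_bp // spacing_bp)


# A 100 Mb chromosome at one point every 5 kb.
print(grid_size(100_000_000))  # 20000
```

The result would then be passed on the command line, for example `OmegaPlus -name run1 -input chr1.vcf -grid 20000 -minwin 5000 -maxwin 100000`, where `run1` and `chr1.vcf` are made-up names. Please check the flags against the manual of the version you are running.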

Please read our latest paper for more information on -maxwin and -minwin:
https://gigascience.biomedcentral.com/articles/10.1186/s13742-016-0114-9

There is no need to use invariable sites.

Both OmegaPlus and SweeD accept VCF.
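
To illustrate why VCF works without invariable sites: every VCF record carries its own chromosome and position, so the physical spacing between SNPs is kept even when only variant sites are listed. A minimal sketch, assuming tab-separated VCF text; the records are made up and the helper `snp_positions` is hypothetical, not part of OmegaPlus or SweeD.

```python
def snp_positions(vcf_text: str) -> list[tuple[str, int]]:
    """Collect (chromosome, position) from the CHROM and POS columns of VCF text."""
    positions = []
    for line in vcf_text.splitlines():
        # Skip blank lines, meta lines ("##...") and the header line ("#CHROM...").
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        positions.append((fields[0], int(fields[1])))
    return positions


# Made-up example with two SNPs and two individuals.
example = "\n".join([
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tind1\tind2",
    "chr1\t1200\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\t0/0",
    "chr1\t8750\t.\tC\tT\t.\tPASS\t.\tGT\t1/1\t0/1",
])
print(snp_positions(example))  # [('chr1', 1200), ('chr1', 8750)]
```

The two SNPs stay 7550 bp apart although nothing between them is stored, which is the information that window sizes given in base pairs depend on.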


best
pavlos

Pavlos Pavlidis, PhD
Research C

Foundation for Research and Technology - Hellas
Institute of Computer Science
GR - 711 10, Heraklion, Crete, Greece