genedb.find_snps module

Wagner Magalhães

Aug 27, 2009, 8:53:25 PM
to glu-users
Hi everyone,

I'm having trouble trying to run the genedb.find_snps module using the
command line:

glu genedb.find_snps --genedb=Name --includeloci=FILE --upbases=500000 --downbases=500000 -o output

Where:

--genedb=Name: a HapMap chromosome file (downloaded from HapMap)
--includeloci=FILE: a file with the first and the last SNP of the gene
that I want to sample

Does anyone know what I am doing wrong?

Is it also possible to use this same module to search by gene name
instead of by loci? For example, to pick up all SNPs within a gene's
range plus the flanking regions on both sides (upbases and downbases)?

Thanks.

Wagner

bioinformed

Aug 27, 2009, 9:23:43 PM
to glu-users
Hi Wagner,

The documentation on the genedb modules is a bit sparse. The genedb
parameter cannot be used to point to a HapMap file (or any sort of
genotype file), but rather requires a database that can be found in
the /usr/local/share/genedb directory. If you're running GLU from a
PC or another server let me know -- I'll get you a copy of the
database. There is a sensible default containing a very comprehensive
set of annotations, so most times you won't need to set that option.

I suspect you want to use something like:

echo BRCA1 | glu genedb.find_snps --upbases=500000 --downbases=500000 -

or create a file with a list of gene symbols, one per line (genes.lst), and run:

glu genedb.find_snps --upbases=500000 --downbases=500000 genes.lst
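
As a rough sketch of the gene-list variant: the snippet below writes a genes.lst file with one symbol per line and then runs the same command on it. BRCA1 comes from the example above; TP53 is only an illustrative second symbol, and the check for glu on the PATH is an added convenience, not something GLU requires.

```shell
# One gene symbol per line. BRCA1 is from the thread; TP53 is an
# illustrative addition.
printf '%s\n' BRCA1 TP53 > genes.lst

# Run the lookup only if glu is available on this machine.
if command -v glu >/dev/null 2>&1; then
  glu genedb.find_snps --upbases=500000 --downbases=500000 genes.lst
else
  echo "glu not found on PATH; genes.lst was written but not processed" >&2
fi
```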

I'll add a description of the resulting output to the documentation
tomorrow, in case you have any questions. Notice that all SNPs in
dbSNP are part of the output.

Hint: If you want to restrict the list of SNPs to, say, HapMap:

glu ginfo /usr/local/share/hapmap/build27/glu/hapmap_CEU_17_r27_fwd_nr_b36.lbat --outputloci=hapmap_chr17_CEU_snps.lst
echo BRCA1 | glu genedb.find_snps --upbases=500000 --downbases=500000 --includeloci=hapmap_chr17_CEU_snps.lst -

Most of the time you don't much care if the SNP list contains extra
SNPs. None of the glu programs typically care if you specify an
include list that contains extra items.

Hope this helps,
-Kevin

Wagner Magalhães

Aug 28, 2009, 5:15:39 PM
to glu-users
Thanks, Kevin, for the help with this module.

Actually, I'm trying to run it on my PC, but once all the command lines
are working well I plan to run it on the LTG server. Could you please
send me a copy of the database for testing? Also, I would like to
confirm whether it is possible to use the following line, where I don't
specify the chromosome and use a list of genes instead of only one:

glu ginfo /usr/local/share/hapmap/build27/glu/hapmap_r27_fwd_nr_b36.lbat --outputloci=hapmap_CEU_snps.lst | glu genedb.find_snps --upbases=500000 --downbases=500000 genes.lst --includeloci=hapmap_snps.lst -o test.txt

Another question:

Is it possible to filter the output to include only SNPs with MAF<0.05?

Thanks.

Wagner.

Wagner Magalhães

Aug 28, 2009, 5:28:30 PM
to glu-users
MAF>0.05, sorry!

Jacobs, Kevin (NIH/NCI) [C]

Aug 31, 2009, 7:32:08 AM
to glu-...@googlegroups.com
Hi Wagner,

You can obtain genedb from the LTG server by copying the following file to your local PC:

/usr/local/share/genedb/genedb_ncbi36.3_dbsnp129.db

Then either specify the file and its location to GLU via the --genedb option, or set the GLU_GENEDB_PATH environment variable to the directory that contains it.
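
A minimal sketch of both options, assuming a POSIX shell. The local directory (a genedb folder under the current directory) is hypothetical, and the snippet assumes that --genedb accepts the full path to the .db file, as described above; the guard around the glu call is only there so the snippet does nothing harmful before the file has been copied.

```shell
# Hypothetical local directory for the database; use any location you like.
GENEDB_DIR="$PWD/genedb"
mkdir -p "$GENEDB_DIR"

# Copy genedb_ncbi36.3_dbsnp129.db from the LTG server into $GENEDB_DIR
# by whatever means you normally use, then point GLU at the directory:
export GLU_GENEDB_PATH="$GENEDB_DIR"

# Alternatively, name the file explicitly on each run. This runs only if
# glu is installed and the database file is already in place.
if command -v glu >/dev/null 2>&1 && [ -f "$GENEDB_DIR/genedb_ncbi36.3_dbsnp129.db" ]; then
  glu genedb.find_snps --genedb="$GENEDB_DIR/genedb_ncbi36.3_dbsnp129.db" \
      --upbases=500000 --downbases=500000 genes.lst
fi
```

Put the export line in your shell startup file if you want the setting to persist across sessions.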

Also, you cannot pipe the output of glu ginfo in the way that you suggest. It requires two distinct commands:

glu ginfo /usr/local/share/hapmap/build27/glu/hapmap_r27_fwd_nr_b36.lbat --outputloci=hapmap_CEU_snps.lst

to obtain a list of all HapMap SNPs. Note that you need to do this only once. Then run

glu genedb.find_snps --upbases=500000 --downbases=500000 genes.lst --includeloci=hapmap_CEU_snps.lst -o regions.txt
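
The two steps can be wrapped in a small script so that the SNP list is built only once. This is a sketch rather than anything GLU provides: the variable names and the check for an existing, non-empty list file are my own additions, and the script uses one file name for the SNP list in both steps.

```shell
# Paths and file names taken from the commands above.
HAPMAP=/usr/local/share/hapmap/build27/glu/hapmap_r27_fwd_nr_b36.lbat
SNPLIST=hapmap_CEU_snps.lst

if command -v glu >/dev/null 2>&1; then
  # Step 1: build the HapMap SNP list, skipping it if a non-empty list
  # already exists (it only needs to be generated once).
  [ -s "$SNPLIST" ] || glu ginfo "$HAPMAP" --outputloci="$SNPLIST"

  # Step 2: look up SNPs around each gene, restricted to that list.
  glu genedb.find_snps --upbases=500000 --downbases=500000 genes.lst \
      --includeloci="$SNPLIST" -o regions.txt
else
  echo "glu not found on PATH; nothing was run" >&2
fi
```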

Applying a filter on MAF is fairly simple, though it depends on what kind of genotype data you are using. E.g., some HapMap populations include children, who should not be counted in MAF calculations.
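
One generic way to do this, sketched under a strong assumption: you already have a tab-separated table of SNP names and MAF values computed from appropriate samples. The file maf.tsv, its two-column layout, and the SNP names below are all made up for illustration and are not a GLU output format. The resulting list can then be passed to --includeloci like any other SNP list.

```shell
# Toy input: SNP name <TAB> MAF. Names and values are invented.
printf 'rsA\t0.01\nrsB\t0.20\nrsC\t0.05\nrsD\t0.06\n' > maf.tsv

# Keep the names of SNPs whose MAF is strictly greater than 0.05.
awk -F'\t' '$2 > 0.05 { print $1 }' maf.tsv > common_snps.lst

# common_snps.lst can now be used as an include list, e.g.
#   glu genedb.find_snps ... --includeloci=common_snps.lst ...
```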

-Kevin