GERP scores for HGDP SNPs


isabelle...@iee.unibe.ch

Feb 7, 2013, 11:31:14 AM
to gen...@soe.ucsc.edu

Good afternoon,

 

I need to retrieve the GERP scores for the HGDP SNPs (a bit fewer than 700,000 SNP positions in the human genome, hg19 coordinates).

I could only get partial results from the genome website (fewer than 1,000 loci).

I checked the mailing list and downloaded the All_hg19_RS.bw file.

But it does not seem to contain what I need (it seems to be a binary file).

 

Could you please suggest a solution to this problem?

 

Thank you very much in advance and thank you for taking care of this very nice genomic resource.

Best, Isabelle

-----------------------------------------
Dr. Isabelle Dupanloup Duperret
CMPG - University of Bern
Baltzerstrasse 6 - CH-3012 Bern
Tel. +41(0)316314549 - Fax +41(0)316314888

 

Pauline Fujita

Feb 7, 2013, 8:14:42 PM
to isabelle...@iee.unibe.ch, gen...@soe.ucsc.edu
Hello Isabelle,

If you output the HGDP coordinates as a BED formatted file you could
use one of our utilities to extract data for these coordinates from
the bigWig file you reference (All_hg19_RS.bw).

1. To output HGDP as BED, go to the Table Browser here:
http://www.genome.ucsc.edu/cgi-bin/hgTables

Once there select:

group: Variation and Repeats
track: HGDP Allele Freq
table: hgdpGeo
region: genome

output format: BED
output file: [enter a name here so that the results are saved to a
file rather than displayed in the browser window]; we'll call the file
"HGDP.bed" in this example

then click "get output".
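
Before moving on, it may be worth confirming that the name column
(column 4) of the exported BED file is unique, since
bigWigAverageOverBed keys its output rows on that field. Below is a
minimal Python sketch, assuming a whitespace-separated BED file with at
least four columns. The helper name find_duplicate_names and the sample
rows are made up for illustration; they are not part of the kent tools
or the HGDP data.

```python
from collections import Counter


def find_duplicate_names(bed_lines):
    """Return the sorted names (BED column 4) that occur more than once."""
    counts = Counter()
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and any comment/track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 4:
            raise ValueError(f"expected at least 4 BED columns, got: {line!r}")
        counts[fields[3]] += 1
    return sorted(name for name, n in counts.items() if n > 1)


# Made-up rows for illustration only (not real HGDP coordinates).
sample = [
    "chr1\t1000\t1001\tsnpA",
    "chr1\t2000\t2001\tsnpB",
    "chr2\t3000\t3001\tsnpA",
]
print(find_duplicate_names(sample))
```

On a real file you could pass the open file handle instead of the
sample list, e.g. find_duplicate_names(open("HGDP.bed")). An empty
result means every name is unique.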

2. You will need to download and install the kent source command line
utilities. You can find instructions on how to do this here:

http://hgdownload.cse.ucsc.edu/downloads.html#source_downloads

Once you have these running, you can run the program
bigWigAverageOverBed without any arguments to see its usage statement:

$ bigWigAverageOverBed
bigWigAverageOverBed - Compute average score of big wig over each bed,
which may have introns.
usage:
bigWigAverageOverBed in.bw in.bed out.tab
The output columns are:
name - name field from bed, which should be unique
size - size of bed (sum of exon sizes
covered - # bases within exons covered by bigWig
sum - sum of values over all bases covered
mean0 - average over bases with non-covered bases counting as zeroes
mean - average over just covered bases
Options:
-bedOut=out.bed - Make output bed that is echo of input bed but
with mean column appended
-sampleAroundCenter=N - Take sample at region N bases wide centered
around bed item, rather
than the usual sample in the bed item.
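
To make the difference between the mean0 and mean columns concrete,
here is a small Python sketch that mirrors the column definitions in
the usage statement on made-up per-base values. It only illustrates
the definitions and is not the tool's actual code; the function name
summarize and the use of None to mark an uncovered base are
assumptions for this example.

```python
def summarize(values):
    """Compute (size, covered, sum, mean0, mean) for one BED item.

    values holds one entry per base of the item; None marks a base
    for which the bigWig has no data.
    """
    size = len(values)
    covered_values = [v for v in values if v is not None]
    covered = len(covered_values)
    total = sum(covered_values)
    # mean0 treats uncovered bases as zeroes, so it divides by the full size.
    mean0 = total / size if size else 0.0
    # mean averages over covered bases only.
    mean = total / covered if covered else 0.0
    return size, covered, total, mean0, mean


# A hypothetical 4-base item with two uncovered bases.
print(summarize([2.0, None, 4.0, None]))
```

For a single-base item such as a SNP, size is 1, so mean0 and mean are
the same whenever the base is covered.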


3. You will want to run something like the following:

bigWigAverageOverBed All_hg19_RS.bw HGDP.bed out.tab
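
Once out.tab exists, the per-SNP scores can be pulled out of it with a
few lines of Python. This is a minimal sketch, assuming the output is
tab-separated with the six columns listed in the usage statement
(name, size, covered, sum, mean0, mean) and that you want to drop SNPs
with no value in the bigWig. The function name read_mean_scores and
the sample rows are made up for illustration.

```python
def read_mean_scores(tab_lines):
    """Map each BED name to its mean score, skipping uncovered items."""
    scores = {}
    for line in tab_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Column order as given in the usage statement.
        name, size, covered, total, mean0, mean = line.split("\t")
        if int(covered) == 0:
            continue  # no bigWig value at this position
        scores[name] = float(mean)
    return scores


# Made-up rows for illustration only (not real GERP scores).
sample = [
    "snpA\t1\t1\t2.5\t2.5\t2.5\n",
    "snpB\t1\t0\t0\t0\t0\n",
]
print(read_mean_scores(sample))
```

With a real run you would pass the open file instead of the sample
list, e.g. read_mean_scores(open("out.tab")).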


Hopefully this is enough to get you started. If you have subsequent
questions please feel free to contact the mailing list again at
gen...@soe.ucsc.edu.

Best regards,

Pauline Fujita
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu