Hello Isabelle,
If you output the HGDP coordinates as a BED-formatted file, you can
use one of our utilities to extract data for those coordinates from
the bigWig file you reference (All_hg19_RS.bw).
1. To output HGDP as BED, go to the Table Browser here:
http://www.genome.ucsc.edu/cgi-bin/hgTables
Once there, select:
group: Variation and Repeats
track: HGDP Allele Freq
table: hgdpGeo
region: genome
output format: BED
output file: enter a file name here so that the results are saved to a
file rather than displayed in the browser window (we'll call the file
"HGDP.bed" for this example)
Then click "get output".
2. You will need to download and install the Kent source command-line
utilities. You can find instructions on how to do this here:
http://hgdownload.cse.ucsc.edu/downloads.html#source_downloads
Once you have these installed, you can run the program
bigWigAverageOverBed without any arguments to see its usage statement:
$ bigWigAverageOverBed
bigWigAverageOverBed - Compute average score of big wig over each bed,
which may have introns.
usage:
bigWigAverageOverBed in.bw in.bed out.tab
The output columns are:
name - name field from bed, which should be unique
size - size of bed (sum of exon sizes
covered - # bases within exons covered by bigWig
sum - sum of values over all bases covered
mean0 - average over bases with non-covered bases counting as zeroes
mean - average over just covered bases
Options:
-bedOut=out.bed - Make output bed that is echo of input bed but
with mean column appended
-sampleAroundCenter=N - Take sample at region N bases wide centered
around bed item, rather than the usual sample in the bed item.
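The usage statement notes that the BED name field should be unique, so it may be worth checking HGDP.bed before running the program. The following is only a sketch: it assumes the Table Browser output is tab-separated with the name in column 4 and no track or browser header lines. The function names and the "_<line number>" suffix are made up for illustration and are not part of the kent utilities.

```shell
# Print every name (column 4) that occurs more than once in a BED file.
# Assumes tab-separated fields and no "track"/"browser" header lines.
duplicate_bed_names() {
    cut -f4 "$1" | sort | uniq -d
}

# Write a copy of the BED file ($1) to $2 with each name made unique by
# appending "_<line number>". The suffix scheme is hypothetical; any
# scheme that yields distinct names would do.
make_bed_names_unique() {
    awk 'BEGIN { FS = OFS = "\t" } { $4 = $4 "_" NR; print }' "$1" > "$2"
}

# Example use (file names as in this message):
#   duplicate_bed_names HGDP.bed
#   make_bed_names_unique HGDP.bed HGDP.unique.bed
```

If the first function prints nothing, the names are already unique and HGDP.bed can be used as it is.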
3. You will want to run something like the following:
bigWigAverageOverBed All_hg19_RS.bw HGDP.bed out.tab
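Once out.tab exists, a small awk filter can pull out just the columns of interest. This sketch assumes out.tab is tab-separated with the six columns in the order given by the usage statement (name, size, covered, sum, mean0, mean). The function name is made up for illustration.

```shell
# Print "name<TAB>mean" for every item with at least one base covered
# by the bigWig. Column order is assumed to match the usage statement:
# name size covered sum mean0 mean
covered_means() {
    awk 'BEGIN { FS = OFS = "\t" } $3 > 0 { print $1, $6 }' "$1"
}

# Example use:
#   covered_means out.tab > HGDP.scores.tab
```

Items with no coverage are dropped here rather than reported as zero, which is the difference between the mean and mean0 columns.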
Hopefully this is enough to get you started. If you have further
questions, please feel free to contact the mailing list again at
gen...@soe.ucsc.edu.
Best regards,
Pauline Fujita
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu