Add -udcDir support to all bigWig and bigBed related tools.

Gert Hulselmans

Aug 21, 2017, 12:48:01 PM
to gen...@soe.ucsc.edu
Hi,

It would be nice if all bigWig and bigBed related tools supported the -udcDir parameter.

Currently, only the following tools accept this argument:
bigBedInfo
bigBedSummary
bigBedToBed
bigWigInfo
bigWigSummary
bigWigToBedGraph
bigWigToWig


These tools do not have it:
bigBedNamedItems
bigWigAverageOverBed
bigWigCat
bigWigCluster
bigWigCorrelate
bigWigMerge


At the moment I only need it for bigWigAverageOverBed, but I guess support for the other tools makes sense too.


I guess this support should be easy to add (looking at bigWigInfo):

#include "udc.h"

void usage()
/* Explain usage and exit. */
{
errAbort(
  "bigWigInfo - Print out information about bigWig file.\n"
  "usage:\n"
  "   bigWigInfo file.bw\n"
  "options:\n"
  "   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs\n"
  "   -chroms - list all chromosomes and their sizes\n"
  "   -zooms - list all zoom levels and their sizes\n"
  "   -minMax - list the min and max on a single line\n"
  );
}

static struct optionSpec options[] = {
   {"udcDir", OPTION_STRING},
   {"chroms", OPTION_BOOLEAN},
   {"zooms", OPTION_BOOLEAN},
   {"minMax", OPTION_BOOLEAN},
   {NULL, 0},
};


int main(int argc, char *argv[])
/* Process command line. */
{
optionInit(&argc, argv, options);
udcSetDefaultDir(optionVal("udcDir", udcDefaultDir()));


Sincerely,
Gert

Cath Tyner

Aug 21, 2017, 6:13:50 PM
to Gert Hulselmans, gen...@soe.ucsc.edu
Thank you for contacting the UCSC Genome Browser support team, and thank you for the -udcDir parameter suggestion. Can you try to add a line like this to your ~/.hg.conf to see if it helps?

udc.cacheDir=/path/to/my/udcDir

If that doesn't work for you, please respond to this forum so that we can provide further assistance. 

Please send new and follow-up questions to one of our UCSC Genome Browser mailing lists below:

  * Post to the Public Help Forum: Email gen...@soe.ucsc.edu or search the Public Archives
  * Post to the Mirror Help Forum: Email genome...@soe.ucsc.edu or search the Mirror Archives
  * Confidential/private help: Email genom...@soe.ucsc.edu

UCSC Genome Browser Announcements List (email alerts for new data & software):
  * Subscribe: Email genome-announce+subscribe...@soe.ucsc.edu 
  * Unsubscribe: Email genome-announce+unsubscri...@soe.ucsc.edu

Enjoy,
Cath
. . .
Cath Tyner
UCSC Genome Browser, Software QA & User Support
UC Santa Cruz Genomics Institute



Gert Hulselmans

Aug 22, 2017, 1:14:03 PM
to Cath Tyner, gen...@soe.ucsc.edu
Hi Cath,

Setting this in the config file does not seem to have any effect. Files are still cached in /tmp/udcCache.

Running bigWigAverageOverBed with strace shows that it does not even try to read ~/.hg.conf:

strace \
    -o bigWigAverageOverBed.strace \
    -s 4096 \
    bigWigAverageOverBed \
        -minMax \
        http://www.epigenomes.ca/data/CEMT/CEMT_37/H3K9me3/A35266.H3K9me3.IX2844.141710.75nt.hg19a.bwa-0.5.7.C4C9JACXX_2_CCCATG.q5.F1028.PET.ucsc.bw \
        regions.bed \
        CEMT0037.gDNA_H3K9me3.bwaob


I also noticed that running bigWigAverageOverBed with this and some other bigWig files results in this error:

processing chromosomes....Interrupted system call
udcDataViaHttpOrFtp: error reading socket


strace output:

read(6, "n\253\34\273\266U\216w[\216s-\307\"\313qy[\315\307\255mu_<\334V\373\340\311\266\332\23oh"..., 207188) = 7300
read(6, 0x10433500, 199888)             = -1 EINTR (Interrupted system call)
--- SIGWINCH (Window changed) @ 0 (0) ---
write(2, "Interrupted system call\nudcDataViaHttpOrFtp: error reading socket", 65) = 65
write(2, "\n", 1)                       = 1
write(5, "chr1-reg3\t1000\t766\t26028\t26.028\t33.9791\t0\t57\nchr1-reg5\t1000\t887\t4505\t4.505\t5.07892\t0\t10\nchr1-reg7\t1000\t0\t0\t0\t0\t0\t0\nchr1-reg9\t1136\t376\t376\t0.330986\t1\t0\t1\nchr1-reg11\t1000\t0\t0\t0\t0\t0\t0\nchr1-reg13\t1000\t0\t0\t0\t0\t0\t0\nchr1-reg15\t750\t21\t21\t0.028\t1\t0\t1\nchr1-reg17\t1000\t229\t593\t0.593\t2.58952\t0\t3\nchr1-reg19\t1000\t0\t0\t0\t0\t0\t0\nchr1-reg21\t1000\t426\t1965\t1.965\t4.61268\t0\t8\nchr1-reg23\t1000\t471\t1768\t1.768\t3.75372\t0\t7\n
...


Would it be possible to retry interrupted system calls before giving up?
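
For illustration, retrying an interrupted read could look like the minimal sketch below. It assumes a plain POSIX read() on a blocking descriptor. readRetryOnEintr is a hypothetical name and is not taken from the Kent source, where the actual socket-reading code may be structured differently.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper, not part of the Kent source tree: read up to `size`
 * bytes from `fd`, restarting the call whenever a signal (such as SIGWINCH)
 * interrupts it before any data has been transferred. */
ssize_t readRetryOnEintr(int fd, void *buf, size_t size)
{
    ssize_t n;

    assert(buf != NULL);
    do {
        n = read(fd, buf, size);
    } while (n < 0 && errno == EINTR);

    /* n is the number of bytes read, 0 at end of file, or -1 with errno
     * set to an error other than EINTR. */
    return n;
}
```

A read() that fails with EINTR has not transferred any data, so repeating the call with the same arguments is safe.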

Many Thanks,
Gert

Christopher Lee

Aug 25, 2017, 1:24:10 PM
to Gert Hulselmans, Cath Tyner, gen...@soe.ucsc.edu
Hi Gert,

Thank you for your question about setting the udc cache directory. We
have created a ticket to add -udcDir support to the tools you mentioned
and will let you know once it has been completed.

Regarding your question about interrupted system calls, have you been
seeing a lot of interrupted system calls while running our utils?
Also, we are curious whether the SIGWINCH came from resizing the
terminal window while running the utils in a pipe or from something
else.

Thank you again for your inquiry and using the UCSC Genome Browser. If
you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a
publicly-accessible forum. If your question includes sensitive data,
you may send it instead to genom...@soe.ucsc.edu.

Christopher Lee
UCSC Genomics Institute

Gert Hulselmans

Aug 25, 2017, 6:02:59 PM
to Christopher Lee, Cath Tyner, gen...@soe.ucsc.edu
Hi Christopher,

Thanks for considering adding udcDir support to the tools that currently lack it.


I think this was the first time I saw this problem with the Kent tools, although it then happened repeatedly.
I don't remember resizing my terminal.
I just ran the command as given in my mail.
At the moment, I can't reproduce this issue.


I now face another issue where bigWigAverageOverBed is unable to fetch a certain number of bytes from a specific host (http://genboree.org).
When I repeat the same command over and over, it eventually succeeds in fetching all the needed bytes.
Would it be possible to add an option to retry requests if they fail, instead of giving up immediately?

=================================
exit_code=255;

# Repeat until successful.
while [ ${exit_code} -ne 0 ] ; do
    # <bigWigAverageOverBed command to retry>
    exit_code=$?;

    echo;
done
=================================
processing chromosomes...unable to fetch 262144 bytes from http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/AN%3AH3K27me3%2093/bigWig?gbKey=1tvb0fa2 @107454464 (got 0 bytes)

processing chromosomes....unable to fetch 262144 bytes from http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/AN%3AH3K27me3%2093/bigWig?gbKey=1tvb0fa2 @117760000 (got 0 bytes)

processing chromosomes.....unable to fetch 262144 bytes from http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/AN%3AH3K27me3%2093/bigWig?gbKey=1tvb0fa2 @127533056 (got 0 bytes)

processing chromosomes......................unable to fetch 262144 bytes from http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/AN%3AH3K27me3%2093/bigWig?gbKey=1tvb0fa2 @159522816 (got 0 bytes)

processing chromosomes........................
=================================






A small list of the commands (I have 1243 bigWigAverageOverBed commands for this specific host, so if you need more of them for testing, let me know):

bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/PI%3AH3K36me3%2013%2073/bigWig?gbKey=1tvb0fa2 regions.bed SRS415768_ZGI_213_768_H3K36me3.bwaob
bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/MS%3AH3K4me3%202/bigWig?gbKey=1tvb0fa2 regions.bed SRS120916_hSKM-2_766_H3K4me3.bwaob
bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/MSCDA%3AH3K27me3%2092%2037/bigWig?gbKey=1tvb0fa2 regions.bed SRS255271_92_762_H3K27me3.bwaob
bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/FB%3AH3K4me1%2076%2084/bigWig?gbKey=1tvb0fa2 regions.bed SRS188622_UW_H22676_645_H3K4me1.bwaob
bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/CCIPsMpT%3AH3K27ac%2062%2016/bigWig?gbKey=1tvb0fa2 regions.bed SRS255280_62_542_H3K27ac.bwaob
bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/PI%3AH3K4me3%2073%2078/bigWig?gbKey=1tvb0fa2 regions.bed SRS415769_ZD4_273_768_H3K4me3.bwaob
bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/AN%3AH3K36me3%2093/bigWig?gbKey=1tvb0fa2 regions.bed SRS167291_93_453_H3K36me3.bwaob
bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/MCD34%3AH3K4me1%2062/bigWig?gbKey=1tvb0fa2 regions.bed SRS114975_RO_01562_488_H3K4me1.bwaob
bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/MCD34%3AH3K4me3%2049/bigWig?gbKey=1tvb0fa2 regions.bed SRS114974_UW_RO_01549_488_H3K4me3.bwaob
bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/CD56%3AH3K4me3%2001%2015/bigWig?gbKey=1tvb0fa2 regions.bed SRS183512_RO_01701_718_H3K4me3.bwaob
bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/PBM%3AH3K27me3%2010%2021/bigWig?gbKey=1tvb0fa2 regions.bed SRS118706_TC010_769_H3K27me3.bwaob
bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/H1%3AH3K27me3%2039/bigWig?gbKey=1tvb0fa2 regions.bed SRS004524_Solexa-8039_424_H3K27me3.bwaob
bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/PFK%3AH3K27me3%2001%2037/bigWig?gbKey=1tvb0fa2 regions.bed SRS167239_skin01_473_H3K27me3_rep1.bwaob
bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/PFK%3AH3K27me3%2001/bigWig?gbKey=1tvb0fa2 regions.bed SRS167239_skin01_473_H3K27me3_rep2.bwaob
bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/SC%3AH3K36me3%2003%2078/bigWig?gbKey=1tvb0fa2 regions.bed SRS306624_STL003_469_H3K36me3.bwaob
bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/MCD34%3AH3K36me3%2036%2057/bigWig?gbKey=1tvb0fa2 regions.bed SRS114973_RO_01536_488_H3K36me3.bwaob
bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/HUES64%3AH3K36me3%20e3/bigWig?gbKey=1tvb0fa2 regions.bed SRS167275_Lib:MC:20101025:03--ChIP:MC:20101020:03:ES_cell_HUES_64:H3K36Me3_528_H3K36me3.bwaob
bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/Spleen%3AH3K4me1%2002%2017/bigWig?gbKey=1tvb0fa2 regions.bed SRS366520_STL002_471_H3K4me1.bwaob
bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/CCCTmem%3AH3K4me3%2032%2064/bigWig?gbKey=1tvb0fa2 regions.bed SRS360526_332_449_H3K4me3.bwaob
bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/SC%3AH3K27me3%2001%2030/bigWig?gbKey=1tvb0fa2 regions.bed SRS309358_STL001_469_H3K27me3.bwaob
bigWigAverageOverBed -minMax http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/FB%3AH3K4me1%2001%2085/bigWig?gbKey=1tvb0fa2 regions.bed SRS167231_HuFNSC01_645_H3K4me1.bwaob



Part of the output:

processing chromosomes
processing chromosomes
processing chromosomes
processing chromosomes
processing chromosomes
processing chromosomesunable to fetch 262144 bytes from http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/BCG%3AH3K4me3%2012/bigWig?gbKey=1tvb0fa2 @115507200 (got 0 bytes)
processing chromosomesunable to fetch 262144 bytes from http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/BAG%3AH3K36me3%2012%2038/bigWig?gbKey=1tvb0fa2 @124035072 (got 0 bytes)
processing chromosomes
processing chromosomes
processing chromosomesunable to fetch 262144 bytes from http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/CSM%3AH3K4me3%2083%2024/bigWig?gbKey=1tvb0fa2 @104996864 (got 0 bytes)
processing chromosomesunable to fetch 262144 bytes from http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/PI%3AH3K4me1%2013%2068/bigWig?gbKey=1tvb0fa2 @95985664 (got 0 bytes)
processing chromosomesunable to fetch 253952 bytes from http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/AN%3AH3K4me3%2093/bigWig?gbKey=1tvb0fa2 @93134848 (got 0 bytes)
processing chromosomesunable to fetch 262144 bytes from http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/NGED%3AH3K27me3%2002/bigWig?gbKey=1tvb0fa2 @116924416 (got 253952 bytes)
processing chromosomes
processing chromosomesunable to fetch 262144 bytes from http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/Thymus%3AH3K9me3%2001%2052/bigWig?gbKey=1tvb0fa2 @100532224 (got 253952 bytes)
processing chromosomesunable to fetch 262144 bytes from http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/CM%3AH3K4me3%2055%2066/bigWig?gbKey=1tvb0fa2 @90529792 (got 253952 bytes)
processing chromosomesunable to fetch 262144 bytes from http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/Esophagus%3AH3K36me3%2002%2077/bigWig?gbKey=1tvb0fa2 @91578368 (got 253952 bytes)
processing chromosomesunable to fetch 262144 bytes from http://genboree.org/REST/v1/grp/Epigenomics%20Roadmap%20Repository/db/Release%209%20Repository/trk/FIL%3AH3K36me3%2095%2054/bigWig?gbKey=1tvb0fa2 @79708160 (got 253952 bytes)

After rerunning the command, the failing offset may change: the chunk that failed before is retrieved completely, and the run then either fails on a later chunk or finally succeeds.

The same byte counts show up again and again:
  - got 253952 bytes
  - got 0 bytes

It is odd that it is able to fetch most of the chunk except the last 8 KiB:
    262144 − 253952 = 8192
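
Since a failed attempt tends to deliver either nothing or everything but the last 8192 bytes, a retry would only need to ask for the missing tail of the block. The sketch below illustrates that idea under stated assumptions: fetchFunc, fetchWithRetry, and the simulated flakyFetch source are hypothetical names made up for illustration and are not part of the udc code, and a real implementation would probably also pause between attempts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: fetch up to `size` bytes starting at `offset`
 * into `buf` and return how many bytes actually arrived (possibly 0). */
typedef size_t (*fetchFunc)(void *context, long long offset, char *buf, size_t size);

/* Hypothetical helper: keep requesting the missing tail of a block until it
 * is complete or `maxAttempts` calls have been made.  Returns the number of
 * bytes obtained, which is less than `size` if the attempts ran out. */
size_t fetchWithRetry(fetchFunc fetch, void *context, long long offset,
                      char *buf, size_t size, int maxAttempts)
{
    size_t total = 0;
    int attempt;

    assert(fetch != NULL && buf != NULL);
    for (attempt = 0; attempt < maxAttempts && total < size; ++attempt)
        total += fetch(context, offset + (long long)total, buf + total, size - total);
    return total;
}

/* Simulated flaky source, for demonstration only: the first `failuresLeft`
 * calls deliver nothing, and every later call delivers at most `chunkLimit`
 * bytes. */
struct flakySource {
    size_t chunkLimit;
    int failuresLeft;
};

size_t flakyFetch(void *context, long long offset, char *buf, size_t size)
{
    struct flakySource *src = context;
    size_t n, i;

    if (src->failuresLeft > 0) {
        src->failuresLeft -= 1;
        return 0;
    }
    n = (size < src->chunkLimit) ? size : src->chunkLimit;
    for (i = 0; i < n; ++i)
        buf[i] = (char)((offset + (long long)i) & 0x7f);  /* dummy payload */
    return n;
}
```

With a source limited to 253952 bytes per call and one initial failure, a 262144-byte request completes on the third attempt: the first call returns nothing, the second returns 253952 bytes, and the third returns the remaining 8192.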


Many thanks,

Gert
