wig to bigwig error


Sundaresan,Varsha

Feb 22, 2018, 11:48:25 AM
to gen...@soe.ucsc.edu

Hello!

I am trying to convert wig files from ftp://ftp.flybase.org/flybase/associated_files/RNA-seq/modencode_30devstages/ to bigWig format using the wigToBigWig command from the UCSC kent utilities, and I get the error "hashMustFindVal: 'chr2CEN' not found", and the same for a few other chrom names that do not match the chrom.sizes file.

I was able to grep out the chrom values that did not match the chrom.sizes file using grep -vE. Now I get the error "Please remove overlaps and try again." Please let me know if there is a way to remove the chrom locations that do not match the chrom.sizes file using wigToBigWig. Thank you.


Best,

Varsha Sundaresan, M.S.
Doctoral Candidate| 
Lab of Dr. Lei Zhou
President, Organization for Graduate Student Advancement and Professional Development (OGAP)
College of Medicine, University of Florida

Cath Tyner

Feb 23, 2018, 1:45:17 PM
to Sundaresan,Varsha, gen...@soe.ucsc.edu
Hello Varsha,

Thank you for contacting the UCSC Genome Browser support team. The error reported by wigToBigWig suggests that your wiggle file has more than one data value for the same chromosome location. The wigToBigWig utility doesn't know which of the data values to use for that position, so it reports an error. For example, your wiggle file may contain two entries for a particular position, and the conflicting lines may be located in different parts of the file, which makes the overlap difficult to see. You will need to remove the overlapping entries for wigToBigWig to run successfully. This may involve a scripted solution, which is generally beyond the scope of this mailing list.
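As a rough illustration of what such a script might look like, here is a minimal Python sketch that reports data lines whose positions overlap, even when they sit far apart in the file. The function names (parse_wig_intervals, find_overlaps) are made up for this example, and it assumes a plain variableStep/fixedStep wiggle file with no bedGraph-style lines. It only locates overlaps; deciding which value to keep is still up to you.

```python
from collections import defaultdict


def parse_wig_intervals(lines):
    """Yield (chrom, start, end, line_number) for each data line.

    Hypothetical helper for illustration. Positions are 1-based as in the
    wiggle format, and `end` is inclusive (start + span - 1).
    """
    chrom = mode = pos = step = None
    span = 1
    for line_number, raw in enumerate(lines, start=1):
        line = raw.strip()
        # Skip blank lines, comments, and track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if fields[0] in ("variableStep", "fixedStep"):
            # Declaration line: remember chrom, span, and fixedStep settings.
            params = dict(f.split("=", 1) for f in fields[1:])
            mode = fields[0]
            chrom = params["chrom"]
            span = int(params.get("span", 1))
            if mode == "fixedStep":
                pos = int(params["start"])
                step = int(params.get("step", 1))
            continue
        if mode == "variableStep":
            start = int(fields[0])
        elif mode == "fixedStep":
            start = pos
            pos += step
        else:
            raise ValueError(f"line {line_number}: data before any declaration")
        yield chrom, start, start + span - 1, line_number


def find_overlaps(lines):
    """Return a list of (chrom, line_a, line_b) for overlapping data lines."""
    by_chrom = defaultdict(list)
    for chrom, start, end, line_number in parse_wig_intervals(lines):
        by_chrom[chrom].append((start, end, line_number))

    overlaps = []
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        prev = intervals[0]  # interval with the furthest end seen so far
        for cur in intervals[1:]:
            if cur[0] <= prev[1]:
                overlaps.append((chrom, prev[2], cur[2]))
            if cur[1] > prev[1]:
                prev = cur
    return overlaps
```

Passing an open file object, for example find_overlaps(open("sample.wig")) with a placeholder filename, would return the pairs of line numbers that conflict, which can help locate the entries to remove.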

Another possibility may not apply, but I will mention it in case it helps. You wrote:

"I was able to grep out the chrom values that did not match the chrom.sizes file using - grep -vE"

If you literally removed only the lines that name each chrom you did not want, then you removed only declaration lines such as

variableStep chrom=chr2CEN

which is followed by the data points for chr2CEN, such as:

40218 1
40293 0
46701 1
46777 0
78095 1

With the chrom declaration for this group of data points removed, wigToBigWig assigns them to whichever chrom is declared above them in the file. In a situation like this, all of those chr2CEN data points would be attributed to the wrong chrom, and any position that also appears in that chrom's own data would now be listed twice, which would explain the overlap error.

For more information, the wiggle format help page shows a simple example like this:

variableStep chrom=chr2
300701 12.5
300702 12.5
300703 12.5
300704 12.5
300705 12.5

If you had data like this:

variableStep chrom=chr2
300701 12.5
300702 12.5
300703 12.5
300704 12.5
300705 12.5
variableStep chrom=chr3
300701 12.5
300702 12.5
300703 12.5
300704 12.5
300705 12.5

...and you then removed the line "variableStep chrom=chr3", you would have the data below, where every data point is now a duplicate (the same positions are listed twice for chr2).

variableStep chrom=chr2
300701 12.5
300702 12.5
300703 12.5
300704 12.5
300705 12.5
300701 12.5
300702 12.5
300703 12.5
300704 12.5
300705 12.5
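One way to avoid this is to drop each unwanted declaration line together with all of the data lines that follow it, up to the next declaration. The Python sketch below shows the idea under some assumptions: the function names and file paths are hypothetical, the chrom.sizes file is assumed to have the chrom name as its first whitespace-separated column, and the wiggle file is assumed to use variableStep or fixedStep declarations.

```python
def load_chrom_names(chrom_sizes_lines):
    """Return the set of chrom names found in chrom.sizes lines (name, size)."""
    names = set()
    for line in chrom_sizes_lines:
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def filter_wig(wig_lines, allowed_chroms):
    """Yield wiggle lines, skipping every block whose chrom is not allowed.

    A block is a variableStep/fixedStep declaration plus the data lines
    that follow it, so data points never end up under the wrong chrom.
    """
    keep = True
    for line in wig_lines:
        fields = line.split()
        first = fields[0] if fields else ""
        if first in ("variableStep", "fixedStep"):
            # New block: decide once whether to keep it and its data lines.
            params = dict(f.split("=", 1) for f in fields[1:])
            keep = params.get("chrom") in allowed_chroms
        elif first in ("track", "browser") or first.startswith("#"):
            # Header and comment lines are always passed through.
            yield line
            continue
        if keep:
            yield line


def filter_wig_file(wig_path, chrom_sizes_path, out_path):
    """Hypothetical convenience wrapper that works on file paths."""
    with open(chrom_sizes_path) as sizes:
        allowed = load_chrom_names(sizes)
    with open(wig_path) as src, open(out_path, "w") as dst:
        dst.writelines(filter_wig(src, allowed))
```

For example, filter_wig_file("input.wig", "dm.chrom.sizes", "filtered.wig") (all three names are placeholders) would write a wiggle file in which the chr2CEN declaration and its data points are both gone, and the filtered file can then be passed to wigToBigWig.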

As you move forward, please feel free to respond to this forum at any time if our support team can provide further assistance, and please always feel free to search our mailing list archives for related posts.

Please send new and follow-up questions to one of our mailing lists below:

  * Post to the Public Help Forum: Email gen...@soe.ucsc.edu or search the Public Archives
  * Post to the Mirror Help Forum: Email genome...@soe.ucsc.edu or search the Mirror Archives
  * Confidential/private help: Email genom...@soe.ucsc.edu


​Enjoy,​
Cath
. . .
Cath Tyner
UCSC Genome Browser, Software QA & User Support
UC Santa Cruz Genomics Institute



Sundaresan,Varsha

Feb 26, 2018, 4:21:57 PM
to Cath Tyner, gen...@soe.ucsc.edu

Hi Cath,

Thank you for your response, I see the problem now. Would you happen to know whether the files at ftp://ftp.flybase.org/flybase/associated_files/RNA-seq/ are from genome build dm3 or dm6?

Thank you!


Best,

Varsha Sundaresan, M.S.
Doctoral Candidate| 
Lab of Dr. Lei Zhou
President, Organization for Graduate Student Advancement and Professional Development (OGAP)
College of Medicine, University of Florida


Jairo Navarro Gonzalez

Feb 28, 2018, 3:05:25 PM
to Sundaresan,Varsha, gen...@soe.ucsc.edu

Hello Varsha,

Thank you for using the UCSC Genome Browser and for your inquiry.

This mailing list is intended to provide support for questions related to the use of the UCSC Genome Browser and utilities. For more information about these RNA-seq files, you should contact FlyBase:

http://flybase.org/contact/email

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a publicly-accessible Google Groups forum.
If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Jairo Navarro 
UCSC Genomics Institute

Want to share the Browser with colleagues?
Host a workshop: http://bit.ly/ucscTraining


