Error File 'OFD1_only_Clinvar05022015_only selected pathogenic_customtrack.txt' - Unrecognized format line 6 of file: chrX 13753396 13753398 delAG . 0 NM_003611.2(OFD1):c.43_44delAG (p.Gln16Argfs) deletion 8481 OFD1 Pathogenic 312262806 RCV000034023 GeneReviews:NBK1188,MedGen:C1510460,OMIM:311200,Orphanet:ORPHA2750,SNOMED CT:52868006 not provided classified by single submitter 1 (note: chrom names are case sensitive, e.g.: correct: 'chr1', incorrect: 'Chr1', incorrect: '1')
Specialist in Human Genetics
M.Sc. Bioinformatics
Clinical Genetics,
Universitätsklinik für Kinder- und Jugendheilkunde
Gemeinnützige Salzburger Landeskliniken Betriebs. GmbH
Paracelsus Medizinische Universität
Müllner Hauptstraße 48
A-5020 Salzburg
Tel.: +43-662-4482-58788
Genetics Office: +43(0)662 4482-2605
Fax: +43(0)662 4482-2621
Email: i.b...@salk.at | www.salk.at
Jonathan Casper <jca...@soe.ucsc.edu>: Feb 06 05:06PM -0800
Hello Ajay, There are a lot of data in your .xls file, and it's not immediately clear which pieces you are trying to graph. Are you trying to plot the values in the "logRatio" row against the names and positions that appear in the SystematicName row? If so, the spreadsheet program that you are using to read the .xls file probably has a "graph" (or "chart") tool that will be much easier for you to use. The bedGraph format is generally used for graphing a value that changes along a single chromosome, not for comparing different chromosomes. We do have a Genome Graphs tool ( http://genome.ucsc.edu/cgi-bin/hgGenome) that displays data on multiple chromosomes at once, but it is also designed to show values that change over the span of each chromosome. Genome Graphs would not be the right tool when associating a single unchanging value with each chromosome. If your problem instead is that you are unable to read the .xls file, we suggest that you search online for software that will allow you to open it. Without making any particular recommendations, among the software that will do this are Microsoft Excel, LibreOffice, and Google Docs (an online service). You may also be interested in posting your question on a more general bioinformatics discussion site like https://www.biostars.org/. This mailing list is devoted to questions regarding the use of the UCSC Genome Browser and its tools; questions on how to interpret TCGA data are a bit outside of our scope. I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu or genome...@soe.ucsc.edu. Questions sent to those addresses will be archived in publicly-accessible forums for the benefit of other users. If your question contains sensitive data, you may send it instead to genom...@soe.ucsc.edu. -- Jonathan Casper UCSC Genome Bioinformatics Group |
Eric Foss <ef...@fredhutch.org>: Feb 06 02:30PM -0800
Dear UCSC Genome Browser, I would like to download all of the ENCODE transcription factor binding data in the vicinity of a gene I’m interested in. Is this possible with the Table Browser or some other UCSC Genome Browser tool? Thank you. Eric |
Brian Lee <bria...@soe.ucsc.edu>: Feb 06 04:43PM -0800
Dear Eric, Thank you for using the UCSC Genome Browser and for your question about downloading all of the ENCODE transcription factor binding data in the vicinity of a gene of interest. You can use the Table Browser to access this information. For example, here is a session where an example region of interest is highlighted near the start of a gene, SIRT1. Below are steps to acquire the Transcription Factor data for this region, chr10:69,637,000-69,639,000. http://genome.ucsc.edu/cgi-bin/hgTracks?hgS_doOtherUser=submit&hgS_otherUserName=Brian%20Lee&hgS_otherUserSessionName=hg19.SIRT1.TFBS.NFYB

First go to the Table Browser, http://genome.ucsc.edu/cgi-bin/hgTables and set the "Group:" to "Regulation", the "track:" to "Txn Factor ChIP" and the "table:" to "wgEncodeRegTfbsClusteredV3". Then select "position" and enter the coordinates of interest: chr10:69,637,000-69,639,000. By clicking "get output" you will see the following output:

#bin  chrom  chromStart  chromEnd  name   score  expCount  expNums          expScores
1116  chr10  69637728    69638048  NFYB   154    1         517              154
1116  chr10  69637926    69638250  CEBPB  430    4         212,343,426,477  275,262,218,430
1116  chr10  69638035    69638275  USF1   139    1         161              139

The chrom, chromStart, and chromEnd fields give the regions where named transcription factors like NFYB have been seen. The score gives a relative indication of the strength of the signal seen in experiments, while expCount indicates the number of experiments in which binding has been observed. What you will notice in this session is that the wgEncodeRegTfbsClusteredV3 track represents a processed, summarized condensation of hundreds of ChIP-seq experiments. If you are interested in looking deeper into the underlying files that produced the clustered summary, you can click the boxes, such as the one for NFYB, and then click the "metadata" link for "more info".
There you will see the lab, antibody, and cell type, and the uniform processed peak track, wgEncodeAwgTfbsSydhK562NfybUniPk, that was used in a clustering algorithm to generate the clusters track. You will also see a UCSC Accession, wgEncodeEH002024, when looking at the metadata details of cluster items. You can also use accessions like wgEncodeEH002024 to find the underlying raw signal track, if desired. For example, with the Track Search tool, http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&hgt_tSearch=1&tsCurTab=simpleTab&tsSimple=wgEncodeEH002024, you click search and then click the similar blue metadata arrow next to the "K562 NF-YB Standard ChIP-seq Signal from ENCODE/SYDH " line to see a displayed "fileName" such as "wgEncodeSydhTfbsK562NfybStdSig.bigWig", which you can download for the entire genome. Alternatively, if you want this signal data for only your region, you can return to the Table Browser, set the "Group" to "All Tables", the "table:" to "wgEncodeSydhTfbsK562NfybStdSig" and the "position:" to chr10:69635813-69645132, and then "get output" as "data points". In summary, the Table Browser output from wgEncodeRegTfbsClusteredV3 ( http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeRegTfbsClusteredV3) provides a processed, clustered coordinate condensation of hundreds of uniformly processed ChIP-seq files (see http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeAwgTfbsUniform for details), which were in turn generated by separate laboratories for various cell lines ( http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeTfBindingSuper). To read more about the background of these data sources, please see the related Track Description Pages linked in this paragraph. Thank you again for your inquiry and for using the UCSC Genome Browser. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum.
If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu. All the best, Brian Lee UCSC Genome Bioinformatics Group |
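The column layout described in the reply above can be read programmatically. The following is a minimal Python sketch, not part of the original reply, that parses the three sample rows quoted from wgEncodeRegTfbsClusteredV3 and pairs each experiment number with its score. The names `TfbsCluster`, `parse_clusters`, and `SAMPLE_OUTPUT` are hypothetical, and the sketch assumes whitespace-separated fields in the nine-column order shown in the header line.

```python
from dataclasses import dataclass

# Sample rows copied from the Table Browser output quoted in the reply.
SAMPLE_OUTPUT = """\
#bin chrom chromStart chromEnd name score expCount expNums expScores
1116 chr10 69637728 69638048 NFYB 154 1 517 154
1116 chr10 69637926 69638250 CEBPB 430 4 212,343,426,477 275,262,218,430
1116 chr10 69638035 69638275 USF1 139 1 161 139
"""


@dataclass
class TfbsCluster:  # hypothetical container, for illustration only
    chrom: str
    start: int
    end: int
    factor: str
    score: int
    exp_scores: dict  # experiment number -> score from that experiment


def parse_int_list(text):
    """Parse a comma-separated integer list, tolerating a trailing comma."""
    return [int(item) for item in text.rstrip(",").split(",")]


def parse_clusters(text):
    """Parse rows in the nine-column layout shown in the header line."""
    clusters = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header
        (_bin, chrom, start, end, name,
         score, exp_count, exp_nums, exp_scores) = line.split()
        nums = parse_int_list(exp_nums)
        scores = parse_int_list(exp_scores)
        if not (len(nums) == len(scores) == int(exp_count)):
            raise ValueError(f"expCount does not match the lists: {line!r}")
        clusters.append(TfbsCluster(chrom, int(start), int(end), name,
                                    int(score), dict(zip(nums, scores))))
    return clusters


clusters = parse_clusters(SAMPLE_OUTPUT)
strongest = max(clusters, key=lambda c: c.score)
print(strongest.factor, strongest.score, len(strongest.exp_scores))
```

For the CEBPB row, the cluster score of 430 equals the largest of its four per-experiment scores. That holds for this one sample row, and the sketch does not assume it is how the score is computed in general.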
Pat Hartz <pha...@peas.welch.jhu.edu>: Feb 06 02:59PM -0500
I have received ‘funny’ results of BLAT mapping on several occasions and I want to understand what the results mean. For example, I BLAT’d the sequence of MIR22-5p (AGUUCUUCAGUGGCAAGCUUUA) and, in addition to the results that nailed the mapping to reference ch17:1713952-1713973, I received a result that included “17_KI270867v1 alt”. How do I interpret the second result? Thank you, Pat Hartz Patricia A. Hartz, PhD Science Writer, OMIM (www.omim.org) Institute of Genetic Medicine Johns Hopkins University |
Matthew Speir <msp...@soe.ucsc.edu>: Feb 06 03:04PM -0800
Hi Pat, Thank you for your question about your BLAT results. The chr*_alt chromosomes are alternative sequences for different regions in the human genome. You can read more about them on the GRCh38/hg38 Gateway page, http://genome.ucsc.edu/cgi-bin/hgGateway?db=hg38, under the section titled " GRCh38 Highlights". You can also find more information on these alternate loci on the Genome Reference Consortium (GRC) website: http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/info/definitions.shtml#ALTERNATE. You can see what regions these alternate sequences correspond to in the genome by using the "Alt Map Super-track", http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg38&g=altSequence. I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu. Matthew Speir UCSC Genome Bioinformatics Group On 2/6/15 11:59 AM, Pat Hartz wrote: |
Matthew Speir <msp...@soe.ucsc.edu>: Feb 06 02:24PM -0800
Hi Anaïs, Thank you for your question. You can look at the following GenomeWiki page for some sample scripts that you can use to create your own Reciprocal Best or Syntenic Net file: http://genomewiki.ucsc.edu/index.php/HowTo:_Syntenic_Net_or_Reciprocal_Best. You should be able to find any of the utilities referenced in those scripts on our download server at http://hgdownload.soe.ucsc.edu/downloads.html under the appropriate folder for your machine. If you are ever curious about what a particular UCSC Genome Browser utility does, you can always run it on the command line without any arguments to see its usage message. For example, if you run chainStitchId without any arguments, you should see the following usage message:

chainStitchId - Join chain fragments with the same chain ID into a single chain per ID. Chain fragments must be from same original chain but must not overlap. Chain fragment scores are summed.
usage:
   chainStitchId in.chain out.chain

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu. Matthew Speir UCSC Genome Bioinformatics Group On 2/5/15 9:00 AM, Anaïs Gouin wrote: |
Da-Peng Wang <wang...@gmail.com>: Feb 06 06:26PM
Dear Colleague, I intend to convert BAM files to WIG files for the UCSC Genome Browser, as we don't have a web server to store bigWig files at the moment. Could you help me make WIG files that can be used in the UCSC browser? Thank you in advance, Dapeng |
Hiram Clawson <hi...@soe.ucsc.edu>: Feb 06 10:52AM -0800
Good Morning Dapeng: There are many procedures to construct such files from BAM files. A Google search for this procedure will find many such examples. Here is another example, using the bedtools 'bamToBed' operation and kent source commands:

    bamToBed -i yourFile.bam | cut -f1-3 | sort -k1,1 -k2,2n \
      | bedItemOverlapCount -chromSize=yourGenome.chrom.sizes test stdin \
      | sort -k1,1 -k2,2n > yourFile.bedGraph
    bedGraphToBigWig yourFile.bedGraph yourGenome.chrom.sizes yourFile.bw

--Hiram On 2/6/15 10:26 AM, Da-Peng Wang wrote: |
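The intermediate yourFile.bedGraph written by the pipeline above is already a plain-text graphing format. As a hedged illustration only (it is not part of Hiram's reply), the Python sketch below prefixes bedGraph rows with a `track type=bedGraph` line so that a small excerpt could be pasted or uploaded as a custom track. The function name `bedgraph_to_custom_track` and the track name and description are hypothetical. This is only practical for small files, since large uploads are discouraged elsewhere in this thread.

```python
def bedgraph_to_custom_track(lines, name, description):
    """Return custom-track text: a bedGraph track line followed by the data rows.

    Each data row is expected to hold chrom, start, end, and value,
    separated by whitespace. Existing track/browser lines and comments
    are dropped so the output has exactly one track line.
    """
    out = [f'track type=bedGraph name="{name}" description="{description}"']
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, value = line.split()[:4]
        float(value)  # raises ValueError if the value is not numeric
        out.append(f"{chrom}\t{int(start)}\t{int(end)}\t{value}")
    return "\n".join(out) + "\n"


# Illustrative rows only; real input would come from yourFile.bedGraph.
example_rows = ["chr1 100 200 3", "chr1 200 250 1.5"]
track_text = bedgraph_to_custom_track(example_rows, "myReads", "read coverage")
print(track_text, end="")
```

To use this on a real file, the rows could be read with `open("yourFile.bedGraph")` and the returned text written to a new file for upload.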
Hiram Clawson <hi...@soe.ucsc.edu>: Feb 06 12:45PM -0800
Good Afternoon Da-Peng: If your files are large, the upload is inefficient and will probably not work. You can load tiny wiggle files, a few thousand data points at most, but they are volatile and will disappear. I would recommend finding a web hosting service where you can supply large files with a URL. Please review your options for graphing data: http://genomewiki.ucsc.edu/index.php/Selecting_a_graphing_track_data_format http://genome.ucsc.edu/goldenPath/help/bigWig.html http://genome.ucsc.edu/goldenPath/help/wiggle.html --Hiram On 2/6/15 12:37 PM, Da-Peng Wang wrote: |
Da-Peng Wang <wang...@gmail.com>: Feb 06 08:37PM
Hi Hiram, Thank you for your reply. But we don't have a web server to store bigWig files, and "bedGraphToBigWig" is unable to generate a WIG file (not bigWig). We hope to upload the WIG file to the UCSC browser from a PC. Could you please help me further? Thanks, Regards, Dapeng |
Brian Lee <bria...@soe.ucsc.edu>: Feb 06 09:43AM -0800
Dear Elisabetta, Thank you for using the UCSC Genome Browser and for your question about ChIP-seq ENCODE scores. You are correct to think of interpreting the darker score as increased biological evidence of binding of that transcription factor at that particular spot. Here is a session that displays the Clustered Transcription Factor Binding Sites track (wgEncodeRegTfbsClusteredV3), and the underlying Uniform Peaks track (wgEncodeAwgTfbsUniform) used to create the clusters, produced by the ENCODE Analysis Working Group. http://genome.ucsc.edu/cgi-bin/hgTracks?hgS_doOtherUser=submit&hgS_otherUserName=Brian%20Lee&hgS_otherUserSessionName=hg19.TFBS.Uniform.Cluster In summary, the AWG created a pipeline to uniformly process several hundred ChIP-seq files generated by the ENCODE project. That uniform processing resulted in comparable signal scores viewable in the wgEncodeAwgTfbsUniform track, which was then used to generate the clustered score in the wgEncodeRegTfbsClusteredV3 track, where a normalization factor was used to attempt to distribute scores more evenly. In the above session just the factors JUN, JUNB, JUND, and MYC have been filtered to display. You can see how MYC has a dark score and has several letters following the block, indicating all the cell types where binding of MYC has been observed. If you click into the box for the MYC cluster, you will see the list of assays where evidence shows there is binding. Returning to the Browser display, you can see several individual "Uniform ...c-Myc" tracks displayed below the clusters track. Those are the separate wgEncodeAwgTfbsUniform tracks used to generate the processed clustered summary wgEncodeRegTfbsClusteredV3 track for this MYC cluster. Those individual uniform processed scores were used to create the cluster score given to the MYC cluster.
Like the MYC factor, you can also click the JUN factors and you will see there is only one observed cell type where this data indicates this factor binds at this location. And similarly below, you will see the "Uniform... Jun" tracks that contributed to the clusters track. Also note that some of the transcription factors, like MYC, have additional Factorbook motif information available to display; you can read more about that in the wgEncodeRegTfbsClusteredV3 track description. For a complete understanding of how the scores were calculated you must read the Track Description pages for these two tracks. See Methods section: http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeRegTfbsClusteredV3 See Methods section and Peak Calling: http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeAwgTfbsUniform If you have more questions about how the score is calculated after reviewing the track description pages, I suggest reviewing our mailing list archive of previously answered questions: https://groups.google.com/a/soe.ucsc.edu/forum/?hl=en&fromgroups#!searchin/genome/score$20tfbs Thank you again for your inquiry and for using the UCSC Genome Browser. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu. All the best, Brian Lee UCSC Genome Bioinformatics Group |
Hello, Ingrid.
The ClinVar Variants track is based on a bigBed file. I assume you obtained your text file by downloading the contents of the ClinVar Variants track via the Table Browser. The problem here is that the bigBed file that the ClinVar Variants track is based on was created from a non-standard BED file created using an AutoSql definition file. Because of this, when you obtain the original BED data and try to load it as a custom track, it will not work.
If you would like to create your own custom track based on the format of the ClinVar Variants track, the best thing to do would be to create your own AutoSql definition file (see Example Three at http://genome.ucsc.edu/goldenPath/help/bigBed.html) and use the bedToBigBed utility (http://hgdownload.cse.ucsc.edu/admin/exe/) to create a bigBed file. I will also attach the clinvar.as file here for you.
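As a simpler alternative to the bigBed route described above, and only as a sketch that is not part of the original reply, the extra ClinVar columns could be dropped so that each row becomes a standard four-column BED line (chrom, start, end, name), which can be loaded as a plain custom track. The Python below assumes the downloaded file is tab-separated, which seems likely because fields such as "not provided" contain spaces. The function name `trim_to_bed4` is hypothetical. All ClinVar annotation beyond the name is lost in this approach.

```python
def trim_to_bed4(lines):
    """Yield BED4 rows (chrom, start, end, name) from wider tab-separated rows.

    Header, track, and browser lines are skipped. Spaces in the name are
    replaced with underscores so the row stays valid if the browser
    splits custom-track fields on any whitespace.
    """
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            raise ValueError(f"expected at least 4 tab-separated fields: {line!r}")
        chrom, start, end, name = fields[:4]
        yield "\t".join([chrom, str(int(start)), str(int(end)),
                         name.replace(" ", "_")])


# First columns of the row quoted in the error message, tab-separated by assumption.
example_row = "\t".join([
    "chrX", "13753396", "13753398", "delAG", ".", "0",
    "NM_003611.2(OFD1):c.43_44delAG (p.Gln16Argfs)", "deletion",
])
bed4_rows = list(trim_to_bed4([example_row]))
print(bed4_rows[0])
```

The AutoSql and bedToBigBed approach described in the reply remains the way to keep the additional ClinVar fields.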
Please contact us again at gen...@soe.ucsc.edu if you have any further questions. Questions sent to that address will be archived in a publicly-accessible forum for the benefit of other users. If your question contains sensitive data, you may send it instead to genom...@soe.ucsc.edu.
---
Steve Heitner
UCSC Genome Bioinformatics Group