BED file format


Kasoji, Manjula (NIH/NCI) [C]

May 28, 2015, 12:19:37 PM
to gen...@soe.ucsc.edu, Kasoji, Manjula (NIH/NCI) [C]
Hi, I'm trying to load my BED file on the custom tracks page and I'm receiving this error:

Error File 'biomart_rn5_ensembl.gtf_formatted.bed' - invalid unsigned integer: "chromStart"

Here is a portion of my BED file:


FR-N-S071300:rn5_ensembl kasojimd$ head biomart_rn5_ensembl.gtf_formatted.bed
chrom chromStart chromEnd strand AssociatedGeneName EnsemblGeneID GOTermName GOdomain GOTermAccession
chr11 1794870 1799670 - MGC95208 ENSRNOG00000000720 apoptotic process biological_process GO:0006915
chr4 7120979 7126436 - Chpf2 ENSRNOG00000010466 metabolic process biological_process GO:0008152
chr4 7120979 7126436 - Chpf2 ENSRNOG00000010466 "transferase activity, transferring hexosyl groups" molecular_function GO:0016758
chr4 7120979 7126436 - Chpf2 ENSRNOG00000010466 Golgi cisterna membrane cellular_component GO:0032580
chr4 7120979 7126436 - Chpf2 ENSRNOG00000010466 "transferase activity, transferring glycosyl groups" molecular_function GO:0016757
chr4 7120979 7126436 - Chpf2 ENSRNOG00000010466 membrane cellular_component GO:0016020
chr4 7120979 7126436 - Chpf2 ENSRNOG00000010466 acetylgalactosaminyltransferase activity molecular_function GO:0008376
chr14 6653086 6658917 - Spp1 ENSRNOG00000043451 osteoblast differentiation biological_process GO:0001649
chr14 6653086 6658917 - Spp1 ENSRNOG00000043451 cytokine activity molecular_function GO:0005125

Any insight on how to fix my BED file will be appreciated. I would like to keep the extra columns because I will be doing a bed intersect with another file and would like the annotations to remain.

Thanks,

Manjula

Kasoji, Manjula (NIH/NCI) [C]

May 28, 2015, 12:21:04 PM
to Kasoji, Manjula (NIH/NCI) [C], gen...@soe.ucsc.edu
I just noticed that the bed file print out did not paste as tab-delimited. The file is indeed tab-delimited.

Thanks,

Manjula

Steve Heitner

May 28, 2015, 1:25:58 PM
to Kasoji, Manjula (NIH/NCI) [C], gen...@soe.ucsc.edu
Hello, Manjula.

I assumed that it was the header in your BED file that was causing the problem. When I stripped out the header, the "unsigned integer" error did indeed go away, but I was presented with a new error: "Expecting number field 5 line 1 of custom track, got MGC95208"

The custom track tool accepts a number of data formats which are listed at the top of the custom track tool page. For BED format, at least, the custom track tool is expecting to see data formatted as described on the BED format page at http://genome.ucsc.edu/FAQ/FAQformat.html#format1. Your data is currently:

chr11 1794870 1799670 - MGC95208 ENSRNOG00000000720 apoptotic process biological_process GO:0006915

It should be something like:

chr11 1794870 1799670 MGC95208 0 -

If you want to preserve all of the additional annotation information you have, you can get creative with the fourth column (the name field). Columns 7-12 are optional and deal with coloration and the display of introns and exons in your BED file.
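As one possible way to "get creative" with the fourth column, here is a minimal Python sketch that drops the header row and rewrites each record as chrom, chromStart, chromEnd, name, score, strand, packing the extra annotation columns into the name. The function name to_bed6, the "|" separator, and the replacement of spaces with underscores are assumptions made for illustration, not anything the custom track tool requires; the sketch also assumes the input is tab-delimited with the nine columns shown above.

```python
import csv
import io


def to_bed6(rows):
    """Yield BED6 lines from rows of the 9-column annotation table (hypothetical helper)."""
    for row in rows:
        # Skip blank lines and the header row, which is not valid BED data.
        if not row or row[0] == "chrom":
            continue
        chrom, start, end, strand, gene, ensembl_id, go_name, go_domain, go_acc = row
        # Pack the annotations into the name column. Spaces are replaced with
        # underscores as a precaution, in case the file is later split on whitespace.
        name = "|".join([gene, ensembl_id, go_name, go_domain, go_acc]).replace(" ", "_")
        # Score is set to 0 as a placeholder, matching the example above.
        yield "\t".join([chrom, start, end, name, "0", strand])


# Two records from the original post, written out tab-delimited.
sample = (
    "chrom\tchromStart\tchromEnd\tstrand\tAssociatedGeneName\tEnsemblGeneID\t"
    "GOTermName\tGOdomain\tGOTermAccession\n"
    "chr11\t1794870\t1799670\t-\tMGC95208\tENSRNOG00000000720\t"
    "apoptotic process\tbiological_process\tGO:0006915\n"
    "chr4\t7120979\t7126436\t-\tChpf2\tENSRNOG00000010466\t"
    "\"transferase activity, transferring hexosyl groups\"\tmolecular_function\tGO:0016758\n"
)

# csv.reader with a tab delimiter also strips the double quotes around GO term names.
reader = csv.reader(io.StringIO(sample), delimiter="\t")
bed_lines = list(to_bed6(reader))
print("\n".join(bed_lines))
```

To convert the real file, the same function could be fed a csv.reader opened on biomart_rn5_ensembl.gtf_formatted.bed instead of the in-memory sample. After an intersect, the packed name can be split on "|" again to recover the individual annotations.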

Please contact us again at gen...@soe.ucsc.edu if you have any further questions. Questions sent to that address will be archived in a publicly-accessible forum for the benefit of other users. If your question contains sensitive data, you may send it instead to genom...@soe.ucsc.edu.

---
Steve Heitner
UCSC Genome Bioinformatics Group