Intersect naming convention warnings - no output with nonamecheck


hmv...@gmail.com

Jul 28, 2017, 9:27:25 AM
to bedtools-discuss
Hi,

I have a curious problem with intersecting .bed and .gff files. I have a .bed file with over 522,000 sites that I want to intersect with a genome annotation. I used the GenomicFeatures R package to produce 5 different .gff files covering different regions of interest. The .bed and the .gff files are in the same genomic order (the genome has chromosomes as well as a lot of scaffolds). I have managed to run my full .bed file against all but one of the .gff files; for this last .gff file I get an error message about my .bed file. There are warnings about naming conventions, but running with -nonamecheck does not solve the issue for this one combination of .bed and .gff. These are the two types of warning message I have gotten while running the .bed against the 5 .gff files:

"WARNING: File input.bed has inconsistent naming convention for record "      -> which results in no output file being created, even though there is a known overlap
"WARNING: File input.bed has a record where naming convention (leading zero) is inconsistent with other files"       -> which results in a correct output file for 4 of the .gff files

What is the difference between these two warning messages in terms of whether the output file gets created (given that -nonamecheck has no influence)?
Does their meaning differ depending on whether they are given for the -a or the -b file?
Is there an upper limit on the number of different chromosome/scaffold naming types allowed in an input file in bedtools?

I find it weird that my .bed file fails for only 1 of the 5 .gff files. If I subset the .bed file to contain the row that triggers the warning plus the rows before and after it (max 2 different types of naming conventions), I am able to produce an output file only if I also subset the .gff to contain the same regions. Running the subset .bed file against the whole .gff also results in "inconsistent naming convention for record".
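One way to see how many naming styles each input file actually mixes is to tally the distinct names in column 1. The following is only a minimal sketch: the helper name `name_styles` is made up for illustration, and the split into "chr-prefixed" versus "unprefixed" is an assumption based on the records quoted in the warnings (`chrLGE22` versus `JRXK01020103.1`). It does not reproduce the check bedtools itself performs.

```shell
# Hypothetical helper (not part of bedtools): count the distinct sequence
# names in column 1 of a BED/GFF file, grouped by whether they start with
# "chr" (any case). Comment, track and browser lines are skipped.
name_styles() {
    awk '
        /^#/ || /^track/ || /^browser/ || NF == 0 { next }
        !seen[$1]++ {
            # Only the first occurrence of each name is classified.
            style = ($1 ~ /^[Cc][Hh][Rr]/) ? "chr-prefixed" : "unprefixed"
            count[style]++
        }
        END {
            printf "chr-prefixed\t%d\n", count["chr-prefixed"]
            printf "unprefixed\t%d\n", count["unprefixed"]
        }
    ' "$1"
}
```

Running `name_styles` on the .bed file and on each of the 5 .gff files would show whether the one failing .gff differs from the other four in the mix of names it contains.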

I'm using bedtools v2.26.0. Below are the commands and their output. I'm also attaching the subset input files and the annotation gff.

> intersectBed -a GPrank_chr3fix_methNo_cov10_TSM_CHRforKEES.bed -b ./Annotation_Kees/tss_region_fixed.txt -wa -wb > GPrank_TSM_KEES_tss_region.bed
***** WARNING: File GPrank_chr3fix_methNo_cov10_TSM_CHRforKEES.bed has inconsistent naming convention for record:
JRXK01020103.1    57380    57381    0.093984215    0.867795505    0.77381129

***** WARNING: File GPrank_chr3fix_methNo_cov10_TSM_CHRforKEES.bed has inconsistent naming convention for record:
JRXK01020103.1    57380    57381    0.093984215    0.867795505    0.77381129


> intersectBed -a test_methSites_problem_2.bed -b test_KEES_tssregion_problem_2.bed -wa -wb > test_result_problem_2.bed
***** WARNING: File test_KEES_tssregion_problem_2.bed has a record where naming convention (leading zero) is inconsistent with other files:
chrLGE22    rtracklayer    tss_region    27593    27942    0    +    .    name=Genbank:XM_015652078.1

***** WARNING: File test_KEES_tssregion_problem_2.bed has a record where naming convention (leading zero) is inconsistent with other files:
chrLGE22    rtracklayer    tss_region    27593    27942    0    +    .    name=Genbank:XM_015652078.1


> intersectBed -a test_methSites_problem_2.bed -b ./Annotation_Kees/tss_region_fixed.txt -wa -wb > test_result_problem_3.bed
***** WARNING: File test_methSites_problem_2.bed has inconsistent naming convention for record:
JRXK01020103.1    57380    57381    0.093984215    0.867795505    0.77381129

***** WARNING: File test_methSites_problem_2.bed has inconsistent naming convention for record:
JRXK01020103.1    57380    57381    0.093984215    0.867795505    0.77381129
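Since the warnings above single out one record per file, it may help to check whether the sequence name in that record occurs in the other file at all. The sketch below lists names found in column 1 of one file but absent from the other. The function name `names_only_in_first` is hypothetical, and whether a missing name is what triggers the warning is an assumption, not something the output above confirms.

```shell
# Hypothetical helper (not part of bedtools): print each sequence name that
# occurs in column 1 of file $1 but never in column 1 of file $2.
# Names are printed once, in order of first appearance in $1.
names_only_in_first() {
    awk '
        /^#/ || NF == 0 { next }
        # The second file is read first so its names can be looked up.
        FILENAME == ARGV[1] { in_second[$1] = 1; next }
        !($1 in in_second) && !printed[$1]++ { print $1 }
    ' "$2" "$1"
}
```

For example, `names_only_in_first test_methSites_problem_2.bed ./Annotation_Kees/tss_region_fixed.txt` would show whether `JRXK01020103.1` is missing from the full annotation; swapping the arguments gives the reverse comparison.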

All ideas are welcome!
Heidi



input_examples.zip

Aaron Quinlan

Aug 11, 2017, 9:59:57 PM
to hmv...@gmail.com, bedtools...@googlegroups.com
Hi Heidi,

I am sorry to be just now getting to this inquiry.  I tried to open your .zip file but it appears to be empty.  Have you resolved this problem?  If not, please send me a direct email with the relevant files so I can look into this for you.

Best,
Aaron