Problem with running gff2bed


Thijmen B

Apr 14, 2020, 1:24:03 PM
to bedops-discuss
Hi all,

I have been trying to run gff2bed for a while. I think the issue is extremely simple, but I cannot figure out what I did wrong.

I downloaded the pre-built package:
bedops_linux_x86_64-v2.4.39.tar.bz2

Unpacking it with
tar jxvf bedops_linux_x86_64-v2.4.39.tar.bz2

created a bin folder from which I would like to run the programs.
However, running gff2bed as follows:
./gff2bed < myfile.gff3 > newfile.bed

resulted in:

./gff2bed: line 132: convert2bed: command not found


Yet convert2bed is in the same folder. Is there perhaps a common issue I missed? Did I do something wrong when unpacking?


Thanks a lot for any help.


Thijmen

Alex Reynolds

Apr 14, 2020, 1:49:59 PM
to bedops-discuss
Either add the path to that folder to your environment PATH variable, or copy the contents of the bin folder to /usr/local/bin. Please see: https://bedops.readthedocs.io/en/latest/content/installation.html#linux
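
The "command not found" message comes from the gff2bed wrapper script calling convert2bed by name, so the shell searches PATH for it rather than the folder the script sits in. A minimal sketch of the first option, assuming the archive was unpacked in the current directory and produced ./bin (BEDOPS_BIN is a variable name used here only for illustration; point it at wherever your bin folder actually lives):

```shell
# Assumption: the tarball was extracted here, creating ./bin with the BEDOPS tools.
BEDOPS_BIN="$PWD/bin"

# Prepend the folder so wrapper scripts such as gff2bed can find convert2bed by name.
export PATH="$BEDOPS_BIN:$PATH"

# Check whether the shell can now resolve the binary; prints its full path if so.
command -v convert2bed || echo "convert2bed is still not on PATH"
```

To keep the change across sessions, the export line could go in a shell startup file such as ~/.bashrc, with an absolute path in place of $PWD.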

Thijmen B

Apr 15, 2020, 5:43:00 AM
to bedops-discuss
Hi Alex,

Thanks for this! I must have made a mistake when I tried to add the directory to my path earlier, because then it didn't work and I didn't understand why.

Now, while running the program I still run into an error:

$ gff2bed < Hzea.gff3 > Hzea.bed


Error on line 143 in -. Genomic end coordinate is less than (or equal to) start coordinate.

Error: Stage [Sorted BED to stdout] failed -- exit status [256 | 1]


This is strange, because the end coordinate on line 143 is not less than or equal to the start coordinate:

scaffold_281 Scipio CDS 252987 253087 0.889 - 0 Parent=996189


Maybe I am overlooking something? Could anyone point me to the right direction?

Thanks a lot in advance.

On Tuesday, April 14, 2020 at 19:49:59 UTC+2, Alex Reynolds wrote:

Alex Reynolds

Apr 15, 2020, 6:37:18 AM
to bedops-discuss
Thanks for posting a link to the original file. 

Here's one way you might debug and fix this issue.

The first step is to make an unsorted BED file, by using the `--do-not-sort` option:

$ gff2bed --do-not-sort < HzOGS2-15205-fixed_note-added.gff3 > HzOGS2-15205-fixed_note-added.gff3.unsorted.bed

The next step is to fix coordinates where the start coordinate is larger than or equal to the stop coordinate (which the BED format does not usually allow):

$ awk -v FS="\t" -v OFS="\t" '{ if ($2 > $3) { t = $2; $2 = $3; $3 = t; } else if ($2 == $3) { $3 = $3 + 1; } print $0; }' HzOGS2-15205-fixed_note-added.gff3.unsorted.bed > HzOGS2-15205-fixed_note-added.gff3.unsorted.fixed.bed

The final step is to sort the unsorted output:

$ sort-bed HzOGS2-15205-fixed_note-added.gff3.unsorted.fixed.bed > HzOGS2-15205-fixed_note-added.gff3.bed

Sorted output is easier to use with set operation tools.
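
To see the coordinate fix on something small, here is a sketch using three made-up BED records (the names and coordinates are hypothetical, not taken from the posted file). The first awk call lists records whose start is not smaller than the end, which may help locate the records that trigger the error; the second applies the same swap-or-extend rule as the awk command above.

```shell
# Hypothetical sample: one valid record, one with start > end, one with start == end.
sample="$(printf 'chr1\t100\t200\tok\nchr1\t500\t400\tswapped\nchr2\t50\t50\tzero\n')"

# List offending records together with their line numbers in the BED stream.
bad="$(printf '%s\n' "$sample" | awk -v FS="\t" '$2 >= $3 { print NR ": " $0 }')"
printf '%s\n' "$bad"

# Apply the fix: swap reversed coordinates, extend zero-length records by one base.
fixed="$(printf '%s\n' "$sample" | awk -v FS="\t" -v OFS="\t" '{ if ($2 > $3) { t = $2; $2 = $3; $3 = t; } else if ($2 == $3) { $3 = $3 + 1; } print $0; }')"
printf '%s\n' "$fixed"
```

Running the first awk call on the real unsorted BED file before fixing it is a cheap way to check whether the swapped or zero-length records make biological sense, since the fix changes them silently.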

I am posting these steps to show how to debug data. To streamline things, these steps can be done in one coordinated step, without creating any wasteful intermediate files:

$ gff2bed --do-not-sort < HzOGS2-15205-fixed_note-added.gff3 | awk -v FS="\t" -v OFS="\t" '{ if ($2 > $3) { t = $2; $2 = $3; $3 = t; } else if ($2 == $3) { $3 = $3 + 1; } print $0; }' | sort-bed - > HzOGS2-15205-fixed_note-added.gff3.bed

Please let me know if you run into any other issues.

Regards,
Alex