Re: [bedtools-discuss] Using bedtools on bgzip and tabix indexed vcf files

Aaron Quinlan

Jun 25, 2012, 9:30:54 PM
to bedtools...@googlegroups.com
Hi Nan,

Bgzip is a (very clever) dialect of gzip. Bedtools has worked with gzipped files for quite some time and what you are seeing is bedtools detecting your files as gzip files. It is not, however, using the tabix functionality in any way.

Best,
Aaron
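
To illustrate the point about bgzip being a gzip dialect, here is a minimal Python sketch (not part of bedtools). It assumes the BGZF layout described in the SAM/BAM specification, where each block is an ordinary gzip member carrying a 'BC' extra subfield. The function names are hypothetical.

```python
import gzip
import struct


def is_gzip(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(2) == b"\x1f\x8b"


def is_bgzf(path):
    """Return True if the first gzip member carries the BGZF 'BC' extra subfield."""
    with open(path, "rb") as fh:
        header = fh.read(12)
        if len(header) < 12 or header[:2] != b"\x1f\x8b":
            return False
        if not header[3] & 0x04:  # FEXTRA flag not set: plain gzip
            return False
        (xlen,) = struct.unpack("<H", header[10:12])
        extra = fh.read(xlen)
    # Walk the extra subfields: SI1, SI2, SLEN (little-endian), then payload.
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2, slen = struct.unpack("<BBH", extra[pos:pos + 4])
        if (si1, si2) == (ord("B"), ord("C")) and slen == 2:
            return True
        pos += 4 + slen
    return False


def read_lines(path):
    """Read text lines with the standard gzip module, with no tabix index involved."""
    with gzip.open(path, "rt") as fh:
        return [line.rstrip("\n") for line in fh]
```

Any gzip-aware reader that handles concatenated gzip members, such as `read_lines` above, can stream a bgzip file from start to end without ever touching the .tbi index.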



On Jun 25, 2012, at 9:25 PM, Nan wrote:

> Hi All,
>
> Neither the previous bedtools manual nor the --help message mentions that bedtools can handle bgzip-compressed, tabix-indexed VCF files. I ran a quick test of 'bedtools intersect' on such files, and it seems to work. Should I rely on this feature, or is it still in development and not yet recommended? Any suggestions?
>
> Thanks a lot,
>
> Nan
>
> Nan Leng
>
> nan....@personlis.com
>

Nan Leng

Jun 25, 2012, 9:38:27 PM
to bedtools...@googlegroups.com
Hi Aaron,

Thanks for the important information and the quick reply. In my limited testing, 'bedtools intersect' ran 30-40% faster on files that were indexed with tabix after bgzip compression than on files that were only bgzip-compressed. I will test more when I get a chance and will keep everybody informed.

Thanks,

Nan

Aaron Quinlan

Jun 25, 2012, 9:53:41 PM
to bedtools...@googlegroups.com
Hi Nan,

Depending on the file size, this is to be expected. Tabix gives up a bit of speed because it queries the file on disk, whereas bedtools loads the B file into memory. So where bedtools may have better speed, tabix will use ~0 memory.

Best,
Aaron
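
The "load the B file into memory" strategy described above can be sketched in a few lines of Python. This is an illustrative assumption about the general approach, not bedtools' actual implementation. The function names are hypothetical, and the input is assumed to be tab-separated BED-like records with half-open coordinates.

```python
from collections import defaultdict


def load_b(lines):
    """Load all B intervals into memory, grouped by chromosome."""
    by_chrom = defaultdict(list)
    for line in lines:
        chrom, start, end = line.split("\t")[:3]
        by_chrom[chrom].append((int(start), int(end)))
    return by_chrom


def intersect(a_lines, b_index):
    """Stream A one record at a time; yield A records that overlap any B interval."""
    for line in a_lines:
        chrom, start, end = line.split("\t")[:3]
        start, end = int(start), int(end)
        # Half-open intervals overlap when each one starts before the other ends.
        if any(s < end and start < e for s, e in b_index.get(chrom, ())):
            yield line
```

Memory use here grows with the size of B, which is the cost of avoiding disk seeks. An index-based lookup of the kind tabix performs would instead read only the relevant region from disk for each query.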
